
US 20020152244A1

(19) United States 

(12) Patent Application Publication (10) Pub. No.: US 2002/0152244 A1

Dean et al. (43) Pub. Date: Oct. 17, 2002



(54) METHOD AND APPARATUS TO 

DYNAMICALLY CREATE A CUSTOMIZED 
USER INTERFACE BASED ON A 
DOCUMENT TYPE DEFINITION 

(75) Inventors: Sara Elo Dean, New York, NY (US); 

Dikran S. Meliksetian, Danbury, CT
(US); Louis Weitzman, Brookline, MA 
(US) 

Correspondence Address: 

FLEIT, KAIN, GIBBONS, 

GUTMAN & BONGINI, P.L. 

ONE BOCA COMMERCE CENTER 

551 NORTHWEST 77TH STREET, SUITE 111

BOCA RATON, FL 33487 (US) 

(73) Assignee: INTERNATIONAL BUSINESS 
MACHINES CORPORATION, 

ARMONK, NY (US) 

(21) Appl. No.: 09/748,716 

(22) Filed: Dec. 22, 2000

Publication Classification 

(51) Int. Cl7 G06F 17/21 

(52) U.S. Cl. 707/530; 707/513



(57) ABSTRACT 

A method on an information processing unit performing 
steps for creating a user interface (UI) to assemble a docu- 
ment that conforms to a particular document type definition. 
The method hides the specific syntax of document type 
definitions such as DTDs and schemas from the user. The 
method begins with a selection from a user for a document 
type or an existing document. Once the document type is 
selected or determined from the existing document the 
document type definitions are retrieved. Tlie document type 
definitions include one or more elements. The method parses 
the elements which are subsequently mapped to one or more 
interface controls such as icons, pull-down menus, buttons, 
selection boxes, progress indicators, on-off checkmarks, 
scroll bars, windows, window edges for resizing the win- 
dow, toggle buttons, forms, and UI widgets. UI can be GUIs 
or interactive voice response systems. A UI editor is pre- 
sented by assembling the one or more interface controls 
without presenting specific document type definition syntax 
to a user. The UI editor permits the user to create and edit the 
content objects that are associated with the interface con- 
trols. The content objects are aggregated in an XML com- 
patible format and ready to be checked in for further 
processing. The method permits specific UI interfaces to be 
created for specific publishing environments and at the same 
time permit the creation of reusable content objects. 

METHOD AND APPARATUS TO DYNAMICALLY
CREATE A CUSTOMIZED USER INTERFACE 
BASED ON A DOCUMENT TYPE DEFINITION 

CROSS-REFERENCE TO RELATED
APPLICATIONS

[0001] Not Applicable. 

PARTIAL WAIVER OF COPYRIGHT 

[0002] All of the material in this patent application is 
subject to copyright protection under the copyright laws of 
the United States and of other countries. As of the first 
effective filing date of the present application, this material
is protected as unpublished material. However, permission 
to copy this material is hereby granted to the extent that the 
copyright owner has no objection to the facsimile reproduc- 
tion by anyone of the patent documentation or patent dis- 
closure, as it appears in the United States Patent and 
Trademark Office patent file or records, but otherwise
reserves all copyright rights whatsoever. 

BACKGROUND OF THE INVENTION 
[0003] 1. Field of the Invention 

[0004] The present invention relates to the field of computerized
publication of documents, and more particularly to
a method for publishing documents using XML on networks
such as the World Wide Web and the ability to publish
documents for different device types such as computers,
PDAs, cell phones and print.

[0005] 2. Description of the Related Art 

[0006] Web sites often present content which is constantly 
changing. Presenting current information to the outside 
world without requiring an inordinate amount of human 
effort and computing power is a major technical challenge to 
Web site designers. 

[0007] Multimedia content including text, graphics, video 
and sound on the Internet needs to be highly adaptive. 
Recently the World Wide Web Consortium (W3C) adopted 
the Extensible Markup Language (XML) as a universal 
format for structured documents and data on the Web. The 
base specifications are XML 1.0, W3C Recommendation 
February '98. See online URL (www.w3.org) for more
information. A content management system based on XML
along with Extensible Stylesheet Language (XSL) enforces
separation of content and presentation, thus allowing flex- 
ible rendering of the content to multiple device types. 
Similarly, such a content management system allows maxi- 
mal reuse of information and data through the composition 
of XML fragments as well as ensures data integrity through 
the consistent use of information. 

[0008] In addition to the availability of XML, new inter- 
faces and devices are emerging, the diversity of users is 
increasing, machines are acting more and more on users' 
behalf, and net activities are possible for a wide range of 
business, leisure, education, and research activities. 

[0009] Systems and methods are being developed for 
generating more flexible content and a capability to manage 
frequent changes to content. One system for achieving 
maximum flexibility and reuse is disclosed in the patent 
application entitled "Method and System for Efficiently
Constructing And Consistently Publishing Web Documents"
filed on Apr. 4, 1999 with application Ser. No. 09/283,542 
with inventors JR Challenger et al. now [Pending] and 
commonly assigned herewith to International Business 
Machines. Disclosed is a system and method where the 
multimedia content is broken down into fragments that can 
be combined into published documents. 

[0010] The use of XML in content management systems 
introduces the following new challenges: 

[0011] 1. A need exists to maintain information about 
the functional and semantic role of each richly tagged 
fragment. This information describes what the content
is about, who the target audience is, and its relationship 
to a taxonomy or other fragments. The same mechanism
should support efficient searches of particular
fragments. 

[0012] 2. A need exists for an efficient method to track 
the effects of changes in a particular richly tagged 
fragment or style and propagate those changes through- 
out the information space. 

[0013] 3. A need exists for a user interface that shields 
the content contributor from knowing the underlying 
syntax and complexities of the XML documents; 

[0014] 4. A need exists for finding relevant document 
fragments on demand, keeping track of the dependencies
between document fragments, transforming combinations
of those document fragments into viewable
pages available to multiple device types, and designing 
a content creation tool that does not overwhelm the 
contributor with the details and the complexities of the 
underlying system. 

[0015] Accordingly, a need exists for a system and method 
that manages and publishes the information content of a Web 
site, or an Internet information portal, in a way that separates 
the information from the form and reuses the stored information
and enables the presentation in the user interface to
be customized for different audiences and target devices and 
media. 

[0016] Other prior art systems/tools that relate to XML
editing include markup languages that use XML to declaratively
specify user interfaces, fully functioning editors, and
systems that publish XML documents. Bluestone Software's
XwingML [for more information refer to URL www.bluestone.com]
enables the creation of Java Swing user interfaces
without coding. The GUI (Graphical User Interface) is
declaratively specified in XML and is translated into working
Java code. This approach separates the GUI code from
the application logic. Their DTD specifies the entire set of 
classes and properties for all of Swing components. How- 
ever, the Bluestone Software's XwingML creates arbitrary 
interfaces in a declarative fashion rather than creating spe- 
cific interfaces that reflect the document types for a given 
publishing environment. Accordingly a need exists for a 
method and tool to accomplish creating specific interfaces 
that reflect the document types for a given publishing 
environment. 

[0017] Another prior art editor for XML is XmetaL, from
Softquad, [refer to online URL www.xmetal.com] which is
a flexible XML editor that supports three views into XML
files. These views include raw XML mode, Tags-On mode
that provides a WYSIWYG presentation with direct access
to elements and attributes, and a full WYSIWYG mode in a 
word-processor like environment. The XmetaL tool 
although useful has the problem that separate style sheets 
need to be used to support the editing vs. the publishing 
process. In addition, one stylesheet may not include all of the 
elements that would be used on other platforms or for
different uses. Accordingly, a content editor is needed that
separates content from presentation and supports the reuse of
that content in different delivery environments such as PCs,
PDAs and phones.

[0018] Still another prior art content editor system is 
Interwoven [refer to online URL www.interwoven.com] 
which is a complete publishing system that supports HTML 
as well as XML. It provides an end to end solution from 
content creation to promotion and publishing. It also has a 
templating tool that provides the means to produce form- 
based pages. However, its support of reusable fragments 
within the environment is rather limited and the publishing 
to viewable pages is performed using non-standard methods. 

[0019] Accordingly a need exists for a method and tool to 
accomplish creating and reusing content fragments using 
standard methods for a given publishing environment. 

SUMMARY OF THE INVENTION

[0020] The system for end-to-end content publishing 
using XML with an object dependency graph is based on the 
following two design principles: First, separation of content 
and style: Information stored in the content management 
system is independent of how it is going to be presented.
The presentation style is encapsulated elsewhere and can be 
used to customize the look and feel based on the end-user 
preferences as well as the delivery methods and devices. 
Second, reusability of information content: Common information
is encapsulated in fragments and subfragments, and
these fragments are insertable in other fragments,
thereby avoiding scattering and duplication of information.
This enables a user to restrict the edit operations to a limited
number of relevant fragments in order to effect global changes. In
addition, the present invention provides data consistency 
and data integrity in the content management. 

[0021] The implementation of the system is based on the 
following: 

[0022] 1. Standards based design: The different com- 
ponents of the system interact through well-defined 
APIs based on industry standards, such as: XML,
XSL, WebDAV, HTTP, DASL. 

[0023] 2. Pervasive use of XML: XML is used not only 
as the content model but also as the language in which 
information is transferred between the different parts of 
the system. 

BRIEF DESCRIPTION OF THE DRAWING(S) 

[0024] The subject matter which is regarded as the inven- 
tion is particularly pointed out and distinctly claimed in the 
claims at the conclusion of the specification. The foregoing 
and other objects, features, and advantages of the invention 
will be apparent from the following detailed description 
taken in conjunction with the accompanying drawings. 

[0025] FIG. 1 is a schematic of a computer system used in 
practicing an embodiment of the invention.



[0026] FIG. 2 is a block diagram showing relationships 
among a set of fragments and compound objects. 

[0027] FIG. 3 is a block/flow diagram of a system/method 
for efficiently constructing and publishing objects in accordance
with the present invention.

[0028] FIG. 4 is a block diagram showing a relationship 
between a set of fragments and compound objects in accor- 
dance with the present invention. 

[0029] FIG. 5 is an object dependence graph (ODG)
corresponding to FIG. 4, in accordance with the
present invention.

[0030] FIG. 6 is a flow diagram for a method for consis- 
tently publishing objects in accordance with the present 
invention. 

[0031] FIG. 7 is a block diagram of the various software 
components operating on the server of FIG. 1, according to 
a preferred embodiment of the present invention. 

[0032] FIG. 8 is a GUI to enable the creation/
modification of multimedia content, according to the present
invention. 

[0033] FIG. 9 is a GUI illustrating how elements pre- 
sented can be replicated, according to the present invention. 

[0034] FIG. 10 is a functional block diagram of the overall 
process of the publishing system using XML with an object 
dependency graph of FIG. 5, according to the present 
invention. 

[0035] FIG. 11 is a functional block diagram of the create
document template process of FIG. 10, according to the 
present invention. 

[0036] FIG. 12 is a functional block diagram of the check-in
document process of FIG. 10, according to the present
invention. 

[0037] FIG. 13A is a process flow for the client editor
GUI that builds the GUI interfaces as shown in FIGS. 8 and 
9 used in the overall process flow of FIG. 10, according to 
the present invention. 

[0038] FIG. 13B is a process flow for the client editor GUI 
that checks-in the document after being constructed into the 
process flow of FIG. 12, according to the present invention. 

DESCRIPTION OF A PREFERRED 
EMBODIMENT(S) 

[0039] It is important to note that these embodiments are
only examples of the many advantageous uses of the innovative
teachings herein. In general, statements made in the
specification of the present application do not necessarily
limit any of the various claimed inventions. Moreover, some
statements may apply to some inventive features but not to
others. In general, unless otherwise indicated, singular elements
may be in the plural and vice versa with no loss of
generality.

[0040] In the drawings, like numerals refer to like parts
throughout the several views.

[0041] Exemplary Network— 100 

[0042] Referring to FIG. 1, a schematic of a computer 
system 100 used in connection with an embodiment of the 
present invention is depicted. One or more client editor
computers 102 and 106 or information processing systems
are connected to a network, Intranet or Internet 110 through
bidirectional data links 104 and 108. A server 114, which 
operates according to the teachings of the invention as 
described hereinafter is connected to the Internet 110 
through a third bidirectional data link 112. Bidirectional data 
links 104, 108, and 112 can for example comprise dial up 
modem connections, Digital Subscriber Lines (DSL), T1
Lines, direct connections and other Local Area Network 
(LAN) segments. The client editor computers 102 and 106 
and the server can for example be IBM PC compatible 
computers. The present invention can be embodied in a 
removable computer readable medium drive such as a floppy 
diskette, CD, DVD or equivalent. The client computers 102, 
106 can be loaded with Web browser software such as 
Netscape Navigator, by America Online of Dulles, Va. or 
Internet Explorer, by Microsoft of Redmond, Wash. The 
Web browser software can serve as a user interface through 
which information is read in from an information providing
user and a problem posing user, and through which infor- 
mation is output to the aforementioned users. 

[0043] A removable computer readable memory medium 
in the form of a diskette 116 is provided for loading software 
onto the knowledge repository server 114. The software
configures the repository server and carries out processes
according to the present invention, which will be described
below with reference to flow diagrams shown in the FIGS.

[0044] Discussion of Hardware and Software Implemen- 
tation Options 

[0045] The present invention, as would be known to one 
of ordinary skill in the art could be produced in hardware or
software, or in a combination of hardware and software. The 
system, or method, according to the inventive principles as 
disclosed in connection with the preferred embodiment, may 
be produced in a single computer system having separate 
elements or means for performing the individual functions 
or steps described or claimed or one or more elements or 
means combining the performance of any of the functions or 
steps disclosed or claimed, or may be arranged in a distrib- 
uted computer system, interconnected by any suitable means 
as would be known by one of ordinary skill in the art.

[0046] According to the inventive principles as disclosed 
in connection with the preferred embodiment, the invention 
and the inventive principles are not limited to any particular 
kind of computer system but may be used with any general 
purpose computer, as would be known to one of ordinary 
skill in the art, arranged to perform the functions described 
and the method steps described. The operations of such a 
computer, as described above, may be according to a com- 
puter program contained on a medium for use in the opera- 
tion or control of the computer, as would be known to one 
of ordinary skill in the art. The computer medium which may 
be used to hold or contain the computer program product, 
may be a fixture of the computer such as an embedded 
memory or may be on a transportable medium such as a disk, 
as would be known to one of ordinary skill in the art.

[0047] The invention is not limited to any particular 
computer program or logic or language, or instruction but 
may be practiced with any such suitable program, logic or 
language, or instructions as would be known to one of 
ordinary skill in the art. Without limiting the principles of
the disclosed invention any such computing system can
include, inter alia, at least a computer readable medium 
allowing a computer to read data, instructions, messages or 
message packets, and other computer readable information 
from the computer readable medium. The computer readable 
medium may include non-volatile memory, such as ROM, 
Flash memory, floppy disk, disk drive memory, CD-ROM,
and other permanent storage. Additionally, a computer read- 
able medium may include, for example, volatile storage such 
as RAM, buffers, cache memory, and network circuits. 

[0048] Furthermore, the computer readable medium may 
include computer readable information in a transitory state 
medium such as a network link and/or a network interface, 
including a wired network or a wireless network, that allow 
a computer to read such computer readable information. 

[0049] Overview of Trigger Monitor 

[0050] This invention presents a system and method for 
publishing documents, for example Web documents, effi- 
ciently and consistently. This method may be used at a wide 
variety of Web sites of the World Wide Web. The present 
invention may be applied to systems outside the Web as 
well, for example, where compound objects are constructed
from fragments. A fragment is an object which is used to 
construct a compound object. The term "document fragment"
or just "fragment" is used throughout this patent to
refer to these reusable information objects, which in their
simplest form are XML fragments. An object is an entity
which can either be published or is used to create something 
which is publishable. Objects include both fragments and 
compound objects. A compound object is an object con- 
structed from one or more fragments. 

[0051] In generating Web content, publishable Web pages 
known as servables may be constructed from simpler fragments.
A servable is a complete entity which may be
published at a Web site. Publishing an object means making 
it visible to the public or a community of users. Publishing 
is decoupled from creating or updating an object and generally
takes place after the object has been created or
updated. It is possible for a servable to embed a fragment 
which in turn embeds another fragment, etc. 

[0052] While fragments significantly increase the capabilities
of a Web site, a number of problems may arise which
need to be solved, including the following: 

[0053] (1) When changes to underlying data occur, how 
does the system determine all objects affected by the 
change? 

[0054] (2) How does the system determine a correct and 
efficient order for updating fragments and servables? 

[0055] (3) How can a system consistently publish Web
pages in the presence of fragments? For an illustrative
example, refer to FIG. 2. Suppose that servables S1 and
S2 both embed the same fragment f1. If f1 changes,
updated versions of S1 and S2 must be published
concurrently; otherwise, the site will look inconsistent.
However, the consistency problem is worse than just
determining if a set of pages all embed the same
fragment. For example, suppose S1 and S3 both embed
fragment f2. If f2 changes, updated versions of both S1
and S3 must be published concurrently. However, if
both f1 and f2 change, updated versions of S1, S2, and
S3 must be published concurrently, even though S2 and
S3 might not embed a common fragment.
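
The consistency requirement in item (3) can be illustrated with a minimal Python sketch. The names `embeds`, `changed`, and `must_publish_together` are hypothetical and do not appear in the specification, and the embed table only restates the relationships given in the example above (FIG. 2 may show more). The sketch groups servables that are linked, directly or through a chain of other servables, by a shared changed fragment.

```python
from itertools import combinations

# Hypothetical embed table restating the example: S1 embeds f1 and f2,
# S2 embeds f1, S3 embeds f2.
embeds = {
    "S1": {"f1", "f2"},
    "S2": {"f1"},
    "S3": {"f2"},
}


def must_publish_together(embeds, changed):
    """Group servables that share a changed fragment, transitively."""
    affected = {s for s, frags in embeds.items() if frags & changed}
    group_of = {s: {s} for s in affected}
    for a, b in combinations(sorted(affected), 2):
        if embeds[a] & embeds[b] & changed:
            # Merge the two groups and point every member at the merged set.
            merged = group_of[a] | group_of[b]
            for s in merged:
                group_of[s] = merged
    unique = []
    for group in group_of.values():
        if group not in unique:
            unique.append(group)
    return sorted(sorted(g) for g in unique)
```

With only f1 changed the sketch groups S1 with S2; with both f1 and f2 changed it returns a single group containing S1, S2, and S3, even though S2 and S3 share no fragment.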

[0056] A method for solving problem (1) is described in a 
commonly assigned patent application, U.S. Ser. No. 
08/905,114, entitled "Determining How Changes to Under- 
lying Data Affect Cached Objects" by J. Challenger, P. 
Dantzig, A. Iyengar, and G. Spivak. The current invention
solves problems (2) and (3). 

[0057] It should be understood that the elements shown in
FIGS. 3 and 6 may be implemented in various forms of 
hardware, software or combinations thereof unless other- 
wise specified. Preferably, these elements are implemented 
in software on one or more appropriately programmed 
general purpose digital computers having a processor and 
memory and input/output interfaces. Referring now to the 
drawings in which like numerals represent the same or 
similar elements and initially to FIG. 3, a block/flow 
diagram of a system/method for efficiently constructing and 
publishing one or more servables in accordance with the 
present invention is shown. In block 300, the system main- 
tains an object dependence graph (ODG) which is a directed 
graph with objects corresponding to nodes/vertices in the 
graph. A dependence edge from a to b, for example, indi- 
cates that a change to object a also affects object b. The edge 
also implies that a should be updated before b after a change 
which affects the values of both a and b occurs. 

[0058] Dependence edges may preferably be used to iden- 
tify the following: 

[0059] a. The objects affected by a change to under- 
lying data. 

[0060] b. The order in which objects are desired or
needed to be updated.

[0061] In one illustrative example, FIG. 4 depicts three
Web pages, P1, P2, and P4. P3 is a fragment embedded in
P1 and P2, and P5 is a sub-fragment embedded in P3.
Similarly, P0 is a fragment embedded in P4. An arrow "A"
from P1 to P4 indicates that P1 has a hypertext link to P4.
In the illustrative example, FIG. 5 depicts an object dependence
graph (ODG) corresponding to the objects in FIG. 4.
The ODG indicates that any change to P0 also changes the
value of P4. It also indicates that any change to P5 or P3 also
changes both P1 and P2. Since P4 includes P0, P0 should be
constructed before P4 when P0 changes. Similarly, P3
should be updated before both P1 and P2 when P3 changes.
In addition, when P5 changes, P5 should be updated before P3,
which in turn is updated before P1 and P2.
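
As a rough illustration, the dependence edges described for FIG. 5 can be written as an adjacency table and searched for every object affected by a change. This is a sketch under assumptions: the dictionary layout and the function name `affected_by` are hypothetical, and the node labels are read from the description of FIG. 4. The hypertext link from P1 to P4 is deliberately left out because it is not a dependence edge.

```python
# Edge a -> b means: a change to a also affects b.
odg = {
    "P0": ["P4"],
    "P5": ["P3"],
    "P3": ["P1", "P2"],
    "P1": [],
    "P2": [],
    "P4": [],
}


def affected_by(odg, changed):
    """Return every object reachable from the changed objects, inclusive."""
    seen = set()
    stack = list(changed)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(odg.get(node, []))
    return seen
```

Under these assumptions a change to P5 reaches P3, P1, and P2, while a change to P0 reaches only P4.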

[0062] Whenever objects change, the system is notified in 
block 310. The system will be notified of a set of objects C 
which have changed. Changes to objects in C will often 
imply changes to other objects as well; the system applies 
graph traversal algorithms to detect all objects which have 
changed and an efficient order (or partial order) for computing
changed objects. In block 320, a set of all objects S
affected by the change is determined by a topological sort (or
partial sort) of all (or some) nodes reachable from C by
following edges in the ODG. Topological sorting of S orders
the vertices so that whenever there is a path from a to b, a
appears before b. A topological sorting algorithm is presented
in Introduction to Algorithms by Cormen, Leiserson,
and Rivest, MIT Press, 1990, Cambridge, Mass., incorporated
herein by reference. Other topological sorting algorithms may
also be employed.
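
One possible way to carry out block 320 is a depth-first topological sort restricted to the nodes reachable from C. The sketch below is illustrative only: it is a standard depth-first formulation and not necessarily the algorithm of the cited textbook, the name `update_order` is hypothetical, and the example graph is the one assumed for FIG. 5.

```python
# Assumed dependence edges for the FIG. 5 example (a -> b: a affects b).
odg = {
    "P0": ["P4"],
    "P5": ["P3"],
    "P3": ["P1", "P2"],
    "P1": [],
    "P2": [],
    "P4": [],
}


def update_order(odg, changed):
    """Topologically sort the objects reachable from the changed set C."""
    order = []   # filled in post-order, reversed at the end
    state = {}   # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return
        if state.get(node) == "visiting":
            raise ValueError("cycle in dependence graph at %r" % (node,))
        state[node] = "visiting"
        for successor in odg.get(node, ()):
            visit(successor)
        state[node] = "done"
        order.append(node)

    for node in sorted(changed):
        visit(node)
    order.reverse()
    return order
```

Every object then appears after all objects it depends on, so updating in the returned order satisfies block 330. The recursion is adequate for small graphs; a large ODG would call for an iterative variant.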



[0063] In block 330, objects in S are updated in an order 
consistent with the topological sort performed in block 320. 

[0064] In block 340, objects are published. In one method,
all servables in S are published concurrently. This avoids
consistency problems. Another method publishes some servables
in S before others, i.e. incremental publication. There
are a number of reasons why incremental publication may be
desirable. These reasons may include:

[0065] (1) In a number of environments, publishing 
documents after the documents are updated may be 
time-consuming. Incremental publication may make 
certain documents available sooner than would be the 
case using the all-at-once approach. 

[0066] (2) It is conceivable that some environments 
may have constraints on the number of documents 
which can be published atomically. The incremental 
approach reduces the number of documents which need 
to be published in single atomic actions. 

[0067] Incremental publishing may be more difficult to 
implement than the all-at-once approach because of the need 
to satisfy consistency constraints such as the ones described 
earlier. 

[0068] Referring to FIG. 6, a method for incrementally 
publishing objects, for example, Web pages, which satisfies 
one or more consistency constraints described earlier is 
shown. In step 610, a consistency graph is created which 
includes servables as vertices/nodes. Edges of the consis- 
tency graph are referred to as consistency edges. A consis- 
tency edge from a servable c to another servable d indicates 
that d should not be published before c. Consistency edges 
do not imply the order in which c and d are to be generated. 
A consistency edge exists if there were a hypertext link from 
d to c and both d and c are in S. Such a link does not imply 
that c must be constructed before d, only that c should be 
published before or concurrently with d. It is entirely pos- 
sible that data dependence edges indicate that d should be 
constructed before c even though c should be published 
before or at the same time as d. 

[0069] Consistency edges are also used to indicate that 
two servables both embed a common fragment whose value 
has changed and thus are to be published concurrently. If c 
and d both embed a common fragment whose value has 
changed, then a consistency edge from c to d and d to c 
should exist. 

[0070] It is now explained how to determine whether two 
servables both embed a common changed fragment. As a 
node a in S is constructed in the order defined by the 
topological sort in block 330, a set of comprising-nodes is
computed for a. Comprising-nodes(a) includes identifiers for
nodes in S which affect the value of a. Comprising-nodes(a)
is the union of b and comprising-nodes(b) for edges (b, a)
which terminate in a where b is a member of S. 
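
The comprising-nodes computation can be sketched as a single pass over S in topological order. The function name `comprising_nodes` and the example data are assumptions for illustration; the rule implemented is the one stated above, namely that comprising-nodes(a) is the union of b and comprising-nodes(b) over edges (b, a) with b in S.

```python
# Assumed dependence edges for the FIG. 5 example (b -> a: b affects a).
odg = {
    "P0": ["P4"],
    "P5": ["P3"],
    "P3": ["P1", "P2"],
    "P1": [],
    "P2": [],
    "P4": [],
}


def comprising_nodes(odg, order):
    """Compute comprising-nodes for each object of S.

    `order` lists the objects of S in topological order, so the set of a
    predecessor b is complete before it is merged into a successor a.
    """
    in_s = set(order)
    comprising = {node: set() for node in order}
    for b in order:
        for a in odg.get(b, ()):
            if a in in_s:
                comprising[a] |= {b} | comprising[b]
    return comprising


result = comprising_nodes(odg, ["P5", "P3", "P1", "P2"])
```

In this example both P1 and P2 end up with P3 and P5 in their sets, which is what later signals that they embed a common changed fragment.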

[0071] A directed graph T is now created including servables
in S (S is the set of all objects which have changed) and
consistency edges. For two servables a and b in S, an edge 
from a to b exists in T if: 

[0072] (1) A hypertext link from b to a exists, or

[0073] (2) a and b both embed a common changed
fragment. This is true if comprising-nodes(a) and comprising-nodes(b)
have a node in common. In this case,
consistency edges from both a to b and b to a exist.
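
The two rules for building T can be sketched as follows. The function name `build_consistency_graph`, the representation of hypertext links as (source, target) pairs, and the example data are assumptions made for illustration only.

```python
def build_consistency_graph(servables, hyperlinks, comprising):
    """Return T as a dict mapping each servable to its successor set.

    An edge a -> b means b should not be published before a.
    `hyperlinks` holds (source, target) pairs; `comprising` maps each
    servable to its comprising-nodes set.
    """
    t = {s: set() for s in servables}
    for a in servables:
        for b in servables:
            if a == b:
                continue
            # Rule (1): a hypertext link from b to a exists.
            if (b, a) in hyperlinks:
                t[a].add(b)
            # Rule (2): a and b embed a common changed fragment.
            if comprising.get(a, set()) & comprising.get(b, set()):
                t[a].add(b)
                t[b].add(a)
    return t


# Hypothetical inputs based on the FIG. 4 description: P1 links to P4,
# and P1 and P2 both depend on the changed fragments P3 and P5.
t = build_consistency_graph(
    ["P1", "P2", "P4"],
    {("P1", "P4")},
    {"P1": {"P3", "P5"}, "P2": {"P3", "P5"}, "P4": {"P0"}},
)
```

Here P4 gains an edge to P1 because P1 links to it, and P1 and P2 are joined in both directions.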

[0074] In step 620, graph traversal algorithms are used on 
T to topologically sort T and find its strongly connected 
components. A strongly connected component of T is a 
maximal subset of vertices T' of T such that every vertex in T' has
a directed path to every other vertex in T'. The previously
cited book, Introduction to Algorithms, by Cormen, et al.
includes an algorithm for finding strongly connected components.
Other algorithms for finding strongly connected
components may also be employed. Each strongly con- 
nected component of T corresponds to a set of servables 
which can be published together. 

[0075] In step 630, servables are published in the follow- 
ing order: Examine servables of T in topological sorting 
order. For a servable a of T, if a was part of a previously 
published strongly connected component, go to the next 
servable. Otherwise, publish all servables corresponding to 
the strongly connected component including a in an atomic 
action. 
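
Steps 620 and 630 can be sketched together. The example below uses Tarjan's algorithm, which is one of the "other algorithms" for strongly connected components and is swapped in here for brevity; it is not claimed to be the textbook algorithm cited above. The function names and the example graph are hypothetical. Tarjan's algorithm emits components in reverse topological order, so reversing its output gives the order in which the atomic publication batches would be issued.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; components come out in reverse topological order."""
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in sorted(graph.get(v, ())):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(sorted(component))

    for v in sorted(graph):
        if v not in index:
            visit(v)
    return components


def publication_batches(graph):
    """Each batch is one strongly connected component, published atomically."""
    return list(reversed(strongly_connected_components(graph)))


# Hypothetical consistency graph: P4 before P1; P1 and P2 together.
t = {"P1": {"P2"}, "P2": {"P1"}, "P4": {"P1"}}
batches = publication_batches(t)
```

For this graph the sketch yields two batches: P4 alone, followed by P1 and P2 in a single atomic action.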

[0076] An extension of this algorithm may be to use either 
more or fewer consistency constraints in the method 
depicted in FIG. 6. Another extension may be to enhance the 
method to try to prevent publication of pages with broken 
hypertext links. The present invention may be extended to 
the publication of documents including but not limited to 
Web pages. 

[0077] A quick publishing and censoring system and 
method which may be used is described in "METHOD AND
SYSTEM FOR RAPID PUBLISHING AND CENSORING
INFORMATION", Attorney docket number YO999-040(8778-753),
filed concurrently herewith, commonly
assigned and incorporated herein by reference. A system and
method which may be used for publishing Web documents
is described in "METHOD AND SYSTEM FOR PUBLISHING
DYNAMIC WEB DOCUMENTS", Attorney docket
number YO999-039(8778-754), filed concurrently herewith,
commonly assigned and incorporated herein by reference. 

[0078] Functional Block Diagram of Various Software 
Components — 700 

[0079] FIG. 7 is a block diagram 700 of the various 
software components operating on the server 114 of FIG. 1, 
according to a preferred embodiment of the present inven- 
tion. 

[0080] The system consists of the following main components:

[0081] 1. Client editor application GUI 702 

[0082] 2. Dispatcher 704 

[0083] 3. MetaStore Manager 710 

[0084] 4. File system manager 708 

[0085] 5. Content Store Manager 706 

[0086] The communication protocols between the differ- 
ent components are based on industry standards: WebDAV 
(World Wide Web Distributed Authoring and Versioning), 
DASL (Distributed Authoring Search Language), and HTTP 
(Hypertext Transfer Protocol). XML is used not only for 
creating the multimedia content, but also for system configuration
documents at startup and as the language for
information exchange between the different parts of the
system. Each of these software components 700 is
described in further detail below.

[0087] Client Editor GUI— 702 

[0088] The client editor application GUI 702 runs on client
systems 102 and 106 and allows content creators to interact
with the server 114. In one embodiment, the client editor
GUI 702 is a standalone Java application and in another
embodiment the client GUI 702 is a Web-browser based 
interface. The GUI 702 allows the content creator to interact 
with the system 114. Through the client GUI 702, the user 
can create new documents, search for existing documents, 
check-out documents, check them back in after modifica- 
tion, and publish them. In addition, the client application 
also allows for previewing of the Web pages that will be 
created from the XML documents.

[0089] Data Model 

[0090] As previously described above, the present inven- 
tion operating on server 114 manages two types of content 
objects, fragments and servables. A fragment is a content 
object that can be reused on several pages: 

[0091] A simple fragment is an XML file that contains 
only text data and metadata, for example a product 
specification. 

[0092] A compound fragment is a simple fragment that 
contains a pointer to an accompanying file, such as a 
video or image file, an XSL style sheet, or a hand- 
crafted HTML page. 

[0093] An index fragment is an automatically updated 
XML file that indexes any number of servables, for 
example the five latest press releases. 

[0094] A composite fragment is a simple fragment that
contains references and imports content from one or 
more fragments. 

[0095] A servable is a composite fragment that contains 
references to one or more style sheet fragments, which 
allow it to be transformed into one or more final 
published pages. 

[0096] Each fragment type and servable type has an associated
DTD that describes the structure of the XML document. (A
document type definition (DTD) is a specific definition that
follows the rules of the Standard Generalized Markup Language.)
The DTD specifies both metadata elements and
content elements. In another embodiment, schemas specify
the definition of the document structure. The DTD must
abide by some constraints imposed by the present invention.
The root element has a child node that is common to all
documents called SYSTEM with the children:

[0097] FRAGMENTID, CREATOR, MODIFIER,
CREATIONTIME, LASTMODIFIEDTIME, PAGETYPE
and CONTENTSIZE.

[0098] These elements are shared across all documents 
and comprise the common metadata used in searches. These 
elements are not displayed in the interface, since their value 
can be inferred from the context. Additional metadata, such 
as KEYWORD and CATEGORY, are provided by common
DTD elements to allow functional and semantic categorization
of the fragments.

[0099] The metadata elements are used both at author-time
and run-time. At author-time the metadata elements are used
for categorization of fragments and for efficient searches of 
subfragments. At run-time, the same metadata elements can 
be used to perform personalization in a dynamic Web site. 

[0100] A fragment can include other fragments as sub- 
fragments. This enables the reuse of content. To accomplish 
inclusion of a subfragment, the entity reference that defines 
all subfragment types must be included in the DTD. Cur- 
rently, the declaration of a subfragment contains the SUB- 
FRAGMENTTYPE attribute set to the appropriate docu- 
ment type, as illustrated in the following example: 

[0101] <!ENTITY % SUBFRAGMENTTYPES SYSTEM

[0102] "http://server/dtd/subfragmenttypes.txt">

[0103] <!ELEMENT SUBFRAGMENT (#PCDATA)>

[0104] <!ATTLIST SUBFRAGMENT SUBFRAGMENTTYPE

[0105] (%SUBFRAGMENTTYPES;) "IMAGEFRAGMENT" #FIXED>

[0106] where server is the name of the server 114. 

[0107] This piece of a DTD specifies that a particular type 
of subfragment, IMAGEFRAGMENT, is needed as content 
for the element SUBFRAGMENT. The subfragment syntax
will be replaced by the XLink syntax once it becomes a W3C
recommendation and XML parsers and XSL transformation
engines support the syntax.

[0108] In the present invention, servables always result in 
one or more final published pages. The DTD of a servable 
indicates the names of the XSL stylesheets that can be used 
for layout for that particular type of document. 

[0109] Because the servable includes content from sub- 
fragments, the stylesheet is written to work on the so-called 
expanded servable. Before page assembly, a servable is 
temporarily rewritten to include the content of all its sub- 
fragments. Thus the system implements a temporary solu- 
tion that mimics the XLink functionality by expanding the 
servable. 
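
The expansion step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the system's actual implementation. The `load_fragment` callback, the convention that the text of a SUBFRAGMENT element holds the identifier of the referenced fragment, and the sample documents are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def expand(element, load_fragment):
    """Return a copy of element in which every SUBFRAGMENT contains
    the (recursively expanded) content of the fragment it refers to."""
    result = ET.Element(element.tag, dict(element.attrib))
    result.text = element.text
    result.tail = element.tail
    for child in element:
        if child.tag == "SUBFRAGMENT":
            # Assumption: the element text is the fragment identifier.
            fragment_id = (child.text or "").strip()
            wrapper = ET.SubElement(result, "SUBFRAGMENT", dict(child.attrib))
            wrapper.append(expand(load_fragment(fragment_id), load_fragment))
        else:
            result.append(expand(child, load_fragment))
    return result

# Hypothetical fragment store and servable, for illustration only.
store = {
    "img1": ET.fromstring("<IMAGEFRAGMENT><TITLE>Logo</TITLE></IMAGEFRAGMENT>"),
}
servable = ET.fromstring(
    '<PAGE><SUBFRAGMENT SUBFRAGMENTTYPE="IMAGEFRAGMENT">img1</SUBFRAGMENT></PAGE>'
)
expanded = expand(servable, store.__getitem__)
```

Because `expand` builds a new tree, the stored servable keeps its compact reference form and only the temporary copy handed to the stylesheet carries the inlined content.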

[0110] In one embodiment, an IBM DB/2™/UDB database
is used to store metadata that can be used either at
author-time or run-time. In one embodiment, the mapping of
the metadata elements of the XML document to the columns
of the relational database is performed using the DB/2 XML
Extender package. For each DTD, a Document Access
Definition (DAD) is defined that specifies this mapping. The
DAD is itself an XML document that abides by a particular
DTD. Each DAD defines the relationship between the hierarchical
structure of the XML document and the columns
and tables of the relational database. The DB/2 XML
Extender package uses the DAD to decompose the input
XML document into the columns, or to compose an XML
document from selected columns. A second embodiment
that does not rely on a DAD consists of the programmatic
mapping of the XML elements into the database columns.
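
A programmatic mapping of the kind mentioned in the second embodiment might look like the sketch below. It is only an illustration: Python's built-in `sqlite3` module stands in for DB/2, and the table name `metastore`, the one-column-per-SYSTEM-child layout, and the sample document are assumptions, not details taken from the system.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Children of the common SYSTEM element, used here as column names.
SYSTEM_COLUMNS = ["FRAGMENTID", "CREATOR", "MODIFIER", "CREATIONTIME",
                  "LASTMODIFIEDTIME", "PAGETYPE", "CONTENTSIZE"]

def store_metadata(conn, document):
    """Copy the SYSTEM metadata of an XML document into one table row."""
    system = document.find("SYSTEM")
    values = [system.findtext(name, default="") for name in SYSTEM_COLUMNS]
    columns = ", ".join(SYSTEM_COLUMNS)
    placeholders = ", ".join("?" for _ in SYSTEM_COLUMNS)
    conn.execute(
        f"INSERT INTO metastore ({columns}) VALUES ({placeholders})", values
    )

# Hypothetical in-memory database and document, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metastore (%s)" % ", ".join(c + " TEXT" for c in SYSTEM_COLUMNS)
)
doc = ET.fromstring(
    "<PRESSRELEASE><SYSTEM><FRAGMENTID>f42</FRAGMENTID>"
    "<CREATOR>editor1</CREATOR></SYSTEM><TITLE>News</TITLE></PRESSRELEASE>"
)
store_metadata(conn, doc)
```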

[0111] In summary, the addition of a new document type
to the system requires the definition of a DTD and the
corresponding metastore mapping. If the document is a
servable, stylesheets defined in XSL are also required.

[0112] Automated User Interface Creation 

[0113] One of the biggest challenges of any publishing
system is to remove as much complexity from the users'
tasks as possible. When dealing with a relatively new 
technology like XML/XSL this aspect of the system 
becomes even more important. By hiding the syntax of XML 
from the editors and authors, domain experts can take on the 
role of creating and modifying the content without worrying 
about the syntax of a particular markup language. 

[0114] When using the Content Editor 702, the tagging 
syntax is never presented to the user. Instead, the present 
invention creates a set of input forms that the user can easily 
fill out. However, some users require placing simple HTML 
markup into text fields. The present invention does allow a 
small subset of HTML tags to be processed. However, this 
defeats many of the reusability and cross-platform publish- 
ing opportunities and is not a recommended strategy. 

[0115] Users are assigned roles in the system and each 
role, in turn, is assigned specific document types. A user 
assigned to an edit role can only create or modify a docu- 
ment assigned to that role. When the user selects a document 
type to create or edit, the Content Editor 702 reads in the 
DTD and automatically constructs an interface based on that 
document structure. A user assigned to a publish role can
only publish a document assigned to that role. 

[0116] DTD to Interface 

[0117] In the present invention, the term "interface
controls" or "GUI widget" or just "widget" is used to describe
an element of a GUI 702 that displays information or 
provides a specific way for a user to interact with the 
operating system and application. Widgets include icons, 
pull-down menus, buttons, selection boxes, progress indi- 
cators, on-off checkmarks, scroll bars, windows, window 
edges (that let you resize the window), toggle buttons, 
forms, and many other devices for displaying information 
and for inviting, accepting, and responding to user actions. 

[0118] The Content Editor 702 creation algorithm for the 
GUI 702 first constructs the basic interface from the DTD. 
This algorithm recursively adds widgets, such as textbox or 
dropdown list, to the display as necessary. If a new XML
document is being created, empty widgets are created. As 
the editor enters content, the widgets are interactively filled 
in. However, if an interface is generated from an existing 
XML document, the existing content is displayed in the 
widgets. In addition, if elements are repeated in the existing
XML document, additional widgets are generated in the
interface as needed. 

[0119] The present invention uses a number of assump- 
tions in handling DTDs and the automatic creation of the 
user interface. Most notably, special attributes are used to 
assist in the transformation of an XML element into an 
appropriate interface widget. In one embodiment, the inter- 
face widgets are created for DTD elements, not for DTD 
attributes and a special type attribute for these elements 
enables the transformation into an appropriate interface 
widget. 

[0120] Until XML schemas (see online URL
www.w3.org) become widely adopted, there is no standard
way to provide data typing for elements in the DTD. The
present invention solves this problem by including the
attribute, DATATYPE, whenever an element is to be displayed
in the interface. If an element does not contain a
DATATYPE attribute, no widget is created in the interface
for that element. Child elements, however, may still
contain DATATYPE attributes to specify their user interface.
In addition, whenever an element has the DATATYPE
attribute, it contains a child of type PCDATA. Thus, through
typing, the DTD can specify, for example, whether a one-line
input, a medium text area or a large text area is required.

[0121] In the partial DTD shown here, TITLE, SHORT- 
DESCRIPTION, and BODY each specify different text 
input widgets to use. 

[0122] <!ELEMENT TITLE (#PCDATA)>

[0123] <!ELEMENT SHORTDESCRIPTION (#PCDATA)>

[0124] <!ELEMENT BODY (#PCDATA)>

[0125] <!ATTLIST TITLE DATATYPE

[0126] (%UITYPES;) "STRING" #FIXED>

[0127] <!ATTLIST SHORTDESCRIPTION DATATYPE

[0128] (%UITYPES;) "SHORTTEXT" #FIXED>

[0129] <!ATTLIST BODY DATATYPE

[0130] (%UITYPES;) "LONGTEXT" #FIXED>

[0131] The external entity UITYPES contains the list of
all GUI widgets known to the editor. These data types 
include: 

[0132] DATE — widget accepting only a date entry. 

[0133] INTEGER — widget accepting only a numerical 
entry. 

[0134] STRING — a one line text box widget.

[0135] SHORTTEXT— a short multi-line text area wid- 
get. 

[0136] LONGTEXT— a long multi-line text area wid- 
get. 

[0137] CHOICE — a drop-down menu that stores user's 
selection. 

[0138] ASSOCLIST— a drop-down menu that stores 
code corresponding to user's selection. 

[0139] BROWSESERVER— a widget enabling direc- 
tory browsing on the server. 

[0140] BROWSELOCAL— a widget enabling directory 
browsing on the local machine. 

[0141] LABEL — a non-editable widget displaying the 
name of the element.

[0142] In another embodiment, additional types may be 
used. 
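
The way a DATATYPE value selects a widget can be sketched as a simple table lookup. The Python below is a hedged illustration, not the editor's code: the regular expression only understands ATTLIST declarations written exactly in the style of the partial DTD shown earlier, and the widget descriptions are plain strings standing in for real GUI classes.

```python
import re

# One entry per UITYPE listed above; values are descriptive placeholders.
WIDGETS = {
    "DATE": "date entry",
    "INTEGER": "numeric entry",
    "STRING": "one-line text box",
    "SHORTTEXT": "short multi-line text area",
    "LONGTEXT": "long multi-line text area",
    "CHOICE": "drop-down menu storing the selection",
    "ASSOCLIST": "drop-down menu storing a code",
    "BROWSESERVER": "server directory browser",
    "BROWSELOCAL": "local directory browser",
    "LABEL": "non-editable label",
}

# Assumption: declarations follow the document's own notation.
ATTLIST = re.compile(
    r'<!ATTLIST\s+(\w+)\s+DATATYPE\s+\(%UITYPES;\)\s*"(\w+)"\s*#FIXED\s*>'
)

def datatypes_from_dtd(dtd_text):
    """Map each element name to its declared DATATYPE value."""
    return dict(ATTLIST.findall(dtd_text))

def widget_for(element_name, datatypes):
    """Return the widget description, or None if no DATATYPE is declared."""
    datatype = datatypes.get(element_name)
    return None if datatype is None else WIDGETS[datatype]

SAMPLE_DTD = '''
<!ATTLIST TITLE DATATYPE
  (%UITYPES;) "STRING" #FIXED>
<!ATTLIST SHORTDESCRIPTION DATATYPE
  (%UITYPES;) "SHORTTEXT" #FIXED>
<!ATTLIST BODY DATATYPE
  (%UITYPES;) "LONGTEXT" #FIXED>
'''
types = datatypes_from_dtd(SAMPLE_DTD)
```

Returning `None` for an element without a DATATYPE mirrors the rule that such elements get no widget of their own.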

[0143] A widely used interface widget is the drop-down
menu. To create one, the DATATYPE attribute is set to
the UITYPE CHOICE, and the CHOICES attribute is set to a
default value from a list of options. The options can be
defined as an external entity for reuse across many DTDs.
For example,

[0144] <!ENTITY % CATEGORYDEFS SYSTEM

[0145] "http://server/dtd/categorydefs.txt">

[0146] defines an external entity for a set of category 
choices. 

[0147] These choices could be defined as the types of IBM 
Netfinity™ Servers: 

[0148] NONE | Netfinity_8500R |

[0149] Netfinity_7000_M10 | Netfinity_5500_M10 |

[0150] Netfinity_5600 | Netfinity_5500

[0151] The definition for CATEGORY in the DTD might 
then be: 

[0152] <!ATTLIST CATEGORY

[0153] DATATYPE (%UITYPES;) "CHOICE" #FIXED

[0154] CHOICES (%CATEGORYDEFS;) "NONE" #REQUIRED>

[0155] The content editor creation algorithm assumes that 
if the first word in the set of CHOICES is the string NONE, 
and the user selects it and the element is optional, the XML 
element will not appear in the document. 

[0156] In a DTD, elements can be required,
optional, or occur 1 or more or 0 or more times. If an element
can appear more than once, buttons appear next to the widget
or group of widgets for replication, as shown in FIG. 9. The
buttons allow the user to repeat a group of GUI widgets
more than once or to remove a repeated group of interface
widgets.
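
As a rough illustration of how the occurrence indicators of a DTD content model could drive these decisions, the following Python sketch reads the standard `?`, `*` and `+` suffixes. It assumes a flat content model made only of element names; nested groups and choice lists are deliberately not handled, and the element names in the sample are hypothetical.

```python
import re

def cardinality(content_model):
    """Map each element name in a flat content model to its occurrence rules."""
    rules = {}
    for name, marker in re.findall(r"(\w+)([?*+]?)", content_model):
        rules[name] = {
            # no marker or '+' means at least one occurrence is required
            "required": marker in ("", "+"),
            # '*' or '+' means the widget gets replication buttons
            "repeatable": marker in ("*", "+"),
        }
    return rules

rules = cardinality("(TITLE, KEYWORD*, CATEGORY?, BODY+)")
```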

[0157] In the present invention, auxiliary lookup tables 
further expand the definition of the DTD, beyond what the 
DTD syntax permits. These lookup tables are encoded as
XML files which are read by the client GUI into a hash table 
for fast access to the information. An auxiliary lookup table 
can store various additional information. In one embodi- 
ment, the lookup table stores the DATATYPE values for 
each DTD element. In another, a lookup table stores all 
translations of element names and help strings, as well as the 
labels in the GUI, to a given language. More specifically, 
when a user logs in and the GUI is initialized, the default 
language in the user's profile determines which translation 
lookup table to load. The GUI uses the lookup table to 
display all labels, DTD element names and help strings in 
the appropriate language. In yet another embodiment, a 
lookup table stores a more user friendly display name for 
DTD elements, to help make the GUI more approachable by 
a non-technical editor. The auxiliary file could be used for 
further information not limited to the types of information 
listed above. 
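
Reading such a lookup table into a hash table could be sketched as below. The XML layout (a LOOKUP root with ENTRY children carrying `element`, `label` and `help` attributes) is an assumption invented for this example; the actual file format is not specified here.

```python
import xml.etree.ElementTree as ET

def load_lookup(xml_text):
    """Read a translation lookup table into a dict keyed by element name."""
    root = ET.fromstring(xml_text)
    return {
        entry.get("element"): {"label": entry.get("label"),
                               "help": entry.get("help")}
        for entry in root.iter("ENTRY")
    }

def display_name(table, element_name):
    """Return the translated label, falling back to the raw element name."""
    entry = table.get(element_name)
    return entry["label"] if entry else element_name

# Hypothetical French lookup table, for illustration only.
FRENCH = """
<LOOKUP lang="fr">
  <ENTRY element="TITLE" label="Titre" help="Titre du document"/>
  <ENTRY element="BODY" label="Corps" help="Texte principal"/>
</LOOKUP>
"""
table = load_lookup(FRENCH)
```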

[0158] Using the client editor GUI 702, the editor logs into
the system 114, and the interface is customized to the particular
roles of which the editor is a member and to the default
language specified in the user profile. The GUI 702 provides
a "point and click" interface to an editor so that the exact
requirements and syntax of XML are hidden. The editor can
choose to create a new document from the lists provided in the
interface or search for existing documents to edit. Typically,
the editor will begin by creating reusable information frag- 
ments, such as images, video, sound and other multimedia 
assets, and other reusable data such as technical specifica- 
tions or descriptions. After the editor has created these 
fragments, composite fragments can be constructed. Refer- 
ences to the reusable fragments previously constructed will 
be included in these new composite documents. 

[0159] Turning to FIG. 8, shown is a GUI 800 to enable 
the creation/modification of multimedia content, according 
to the present invention. In this example, the GUI 800 is 
divided into two major areas. The left panel of the GUI 802 
displays a working set of document fragments and the right 
panel 820 is an editor pane editing a specific image frag- 
ment. Suppose in this example, the editor is a product 
manager for a line of portable computers, the IBM Think- 
Pad™. The product manager may wish to create a new 
fragment (i.e. a portion of a Web page or Web pages) 
detailing the new portable computer offering. Using known 
relational database techniques, a database 712 is searched 
for content that may be useful to the product manager. The
search may be by category, by keyword, by title, by author, 
by last modification date and any other searchable field in 
the database. The left panel illustrates the partial results of 
a query in the database 712. Shown is the left panel divided 
into four areas, title 804, doctype 806, revision date 808, and 
creator 810. Shown selected here is a row of information 812.
In this example, the product manager is creating a new
image fragment and enters content to the fields 820-832 
including the directory to save the file 828, the name for the 
file 830 and a pointer to the image 832 to be uploaded from 
the local machine to the server. 

[0160] FIG. 9 is a GUI 900 illustrating how elements
presented can be replicated, according to the present invention.
The -/+ buttons 902-910 are used to add and remove
widgets from the GUI 900, and as a result, elements in the
XML file. For example, the software category 928 may have
more than one entry for a given product description. Returning
to the product manager example for IBM ThinkPad™,
there may be one or more applicable hardware options such
as "AS400™ Servers and Workstations" 922 and "Monitor
and Displays" 924. The creation of these forms is based
directly on the DTD. It is important to note that in both
FIGS. 8 and 9, the specific syntax of XML is hidden from
the user/editor, thus simplifying the interface.

[0161] Because of the strict way that the interface is 
constructed, each widget knows whether or not it is required 
and whether or not more elements can be added to an XML 
instance. If an element in the DTD is required, the widget
will be highlighted (e.g. colored brightly) to allow the user
to distinguish which fields must be filled in before submis- 
sion. Therefore, only well-formed and valid documents are 
submitted to the server. 

[0162] Although the present invention uses existing XML
technologies and standards, newer standards, such as
XLink and XML Schema, and technologies based on those
can be leveraged to improve the design and the implementation
of the present invention. It should be understood
that the use of those technologies is within the true scope
and spirit of the present invention.

[0163] Yet another embodiment includes a number of additional
features, including automated extraction of keywords, automated
translation, and a Web-centric client that requires no installation
and can easily be accessed from any browser.

[0164] Object Oriented GUI 

[0165] Each Java widget is encapsulated in a set of classes 
that include additional functionality. This object-oriented 
approach allows for modular design and future extensions to 
the set of interface widgets. Inheritance and generic methods 
are used throughout the class hierarchy for the definition of 
the interface widgets. Each UITYPE may also provide very
specialized functionality. For example, BROWSELOCAL 
and BROWSESERVER provide a button which, when 
clicked on, opens a dialog to choose a file on the local 
system or a directory on the remote server, respectively. This 
functionality is encapsulated within these particular classes. 
These widgets are illustrated in FIG. 8. 

[0166] UITYPE LONGTEXT element tags are also
handled specially within the system. The system assumes
that UITYPE LONGTEXT tags may be composed of one or
more PARAGRAPH tags. Blank lines in the input are
interpreted as paragraph separators. When constructing the
XML document, these PARAGRAPH tags are automatically
composed within the outer UITYPE LONGTEXT tag. This
functionality is inherited through the text widget class 
hierarchy. In general, this functionality can be enabled or 
disabled as the application requires. 
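
The paragraph handling for LONGTEXT input can be illustrated with a short Python sketch. The function name and the sample text are assumptions made for the example; only the rule that blank lines separate PARAGRAPH elements comes from the description above.

```python
import re
import xml.etree.ElementTree as ET

def longtext_to_element(tag, text):
    """Build a LONGTEXT element whose PARAGRAPH children are split on blank lines."""
    element = ET.Element(tag)
    for block in re.split(r"\n\s*\n", text.strip()):
        if block.strip():
            ET.SubElement(element, "PARAGRAPH").text = block.strip()
    return element

body = longtext_to_element(
    "BODY", "First paragraph.\n\nSecond paragraph,\nstill the second.\n"
)
paragraphs = [p.text for p in body.findall("PARAGRAPH")]
```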

[0167] Process Flow for Client Editor GUI 

[0168] FIG. 13A is a process flow 1300 for the client 
editor GUI 702 that builds the GUI interfaces as shown in 
FIGS. 8 and 9 used in the overall process flow 1000 of FIG. 
10, according to the present invention. 

[0169] When launching the GUI interface, the user enters 
a user name and password. Based on the roles assigned, the 
user is authorized to create certain types of documents. Only 
authorized document types appear in the user's GUI. For 
example, someone outside of accounting would not be 
authorized to create a bill.

[0170] Get DTD & Parse DTD— 1302-1306 

[0171] The process begins with step 1302 with the user 
selecting from a menu a document type that they wish to 
create. Once the user makes a selection the corresponding 
DTD is retrieved from the file system 714 in step 1304. Next 
in step 1306, the DTD is parsed. One parsing tool which has 
been used is Xerces (refer to online URL
http://xml.apache.org/index.html for more information).



Type and Context Information - 1308

Function - For every element in the DTD, the following information is determined:
1) its location in the hierarchy (its XPath); and 2) type information for
DTD elements.

Output - Type (e.g., a single line of input, multi-line input, choice element, etc.)
and context (XPath) information for each element in the DTD.

Mapping Information for Type and Context - 1310

Function - Given a DTD element, its type and its XPath, the system maps this
input information to the GUI values for generating the interface for
that element. The system uses the editor's user profile and lookup
tables to determine the values. These GUI values include but are not
limited to:

1) the type of input widget to display in the interface (e.g., simple 1-line
string, multi-line text area, drop-down menu, directory browser for
server, directory browser for local machine, etc.).

2) the name to display in the interface, translating the element name
to user friendly text in the user's preferred language using a lookup
table.

3) the value of a help string to be made available in the interface if the
user needs it (e.g., as a tooltip) in the user's preferred language.

Input - DTD element name, its type and XPath, and attributes from editor's
user profile from 1308.

Output - GUI values to display DTD element.

Generate GUI - 1312

Function - Taking the input information, this step processes the DTD elements
in order and recursively, while maintaining hierarchical inclusion, and
generates the GUI 702 as a set of interface widgets to be edited by
the user. The hierarchy can be represented by indentation within the
interface to indicate when one item is included by another. During
this recursion, the process maintains a link between the interface
widget and the corresponding element in the XML document under
creation. If the interface is constructed for an existing XML document,
the previously stored content is supplied to be displayed in the
widgets. An existing XML document may also contain more than one
occurrence of an element. If so, the process adjusts the interface
accordingly and adds the elements. Also, the process maintains and
displays information about whether an element is required or not in
the final document. This information is used in the test in Check in
step 1324. If an element can occur more than once in the interface,
affordances are placed in the interface (i.e., "+/-" buttons) so that the
user can easily repeat or delete repeated elements from the XML
document being created/edited.

Input - The GUI values to display DTD elements from 1310. Content from
1314 if editing an existing document.

Output - The interface to display in either a web-based client or standalone
Java client, with content if generating from an existing XML
document.

Content from Existing XML Document - 1314

Function - This step incorporates the content of an existing document into the
GUI being constructed.

Input - XML file from file system 714.

Output - The content to be displayed in the interface.
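
The recursive walk of steps 1308 through 1312 can be sketched as follows. This Python illustration assumes the DTD has already been parsed into two plain dictionaries, one listing each element's children and one giving the DATATYPE of typed elements; those structures, the function name and the sample element names are hypothetical.

```python
def generate(name, children, datatypes, path="", depth=0, out=None):
    """Walk the element hierarchy in order and collect one widget
    description per element that declares a DATATYPE."""
    out = [] if out is None else out
    xpath = f"{path}/{name}"              # context information (step 1308)
    datatype = datatypes.get(name)
    if datatype is not None:              # untyped elements get no widget
        out.append({"xpath": xpath, "widget": datatype, "indent": depth})
    for child in children.get(name, []):  # recursion keeps document order
        generate(child, children, datatypes, xpath, depth + 1, out)
    return out

# Hypothetical parsed DTD, for illustration only.
children = {"PRODUCT": ["TITLE", "DETAILS"],
            "DETAILS": ["SHORTDESCRIPTION", "BODY"]}
datatypes = {"TITLE": "STRING",
             "SHORTDESCRIPTION": "SHORTTEXT",
             "BODY": "LONGTEXT"}
widgets = generate("PRODUCT", children, datatypes)
```

The `xpath` field plays the role of the link between a widget and its XML element, and `indent` is one way the hierarchy could be shown by indentation.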



[0172] Display GUI— 1316 

[0173] The results of the user input are then used to 
generate the GUI 702 with all the GUI widgets and user 
input from steps 1302-1312. 

[0174] FIG. 13B is a process flow 1320 for the client 
editor GUI 702 that checks-in the document after it is 
constructed into the process flow 1200 of FIG. 12, accord- 
ing to the present invention. The editor enters content for an 
XML document using the widgets in the GUI in step 1322. 
Once the user is satisfied with the document, the user 
checks-in the document in step 1324 or 1202 of FIG. 12. 

[0175] Create XML Document from GUI Widgets— 1330 

[0176] Function — The process extracts the content
from the GUI widgets and places it into the XML
document being constructed. This is accomplished
by looping over the hashtable to get each widget and
its corresponding XML element, extracting the content
from the GUI widget and placing it into the
XML element. To do this we encapsulate this information
in the interface object with generic GET and
SET methods. This allows us to call a standard
method, independent of type, on the interface object
to get user input and place it into the XML element.

[0177] Input — XML document being created or
edited and the hashtable that stores the GUI widgets 
and their corresponding XML element. 

[0178] Output — An XML document that represents 
the complete document filled in with the content 
from the GUI widgets.

[0179] Check-In Process 1324-1336 

[0180] In step 1326 a test is made to determine if the
document is valid, that is, if all the required fields are
filled in. If any required field is not filled in, the user is
notified in step 1328; otherwise the process continues on to
step 1330. In one embodiment, the user is also notified if certain
required fields that have choices such as "not applicable" or
"none" are not filled in. An XML document is created from
the GUI widgets in step 1330. In step 1332 any empty
optional elements are removed and in step 1334 any optional
categories set to values such as "not applicable" or "none"
are also removed. Lastly the document is submitted to the
server 114 for processing as described in step 1212 of FIG.
12.
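
Steps 1326 through 1334 can be summarized in a short Python sketch. It is a simplified illustration under stated assumptions: the widgets are represented as a flat list of dictionaries with `tag`, `value` and `required` keys, a missing required field raises an exception instead of notifying the user, and submission to the server is left out.

```python
import xml.etree.ElementTree as ET

PLACEHOLDERS = {"none", "not applicable"}

def check_in(root_tag, widgets):
    """Validate required fields, build the XML document from the widgets,
    and drop optional elements that are empty or set to a placeholder."""
    # Step 1326: every required field must be filled in.
    missing = [w["tag"] for w in widgets
               if w["required"] and not w["value"].strip()]
    if missing:
        raise ValueError("required fields not filled in: " + ", ".join(missing))
    # Step 1330: create one XML element per widget.
    root = ET.Element(root_tag)
    for w in widgets:
        ET.SubElement(root, w["tag"]).text = w["value"]
    # Steps 1332 and 1334: remove empty or placeholder optional elements.
    for widget, element in zip(widgets, list(root)):
        value = widget["value"].strip()
        if not widget["required"] and (not value or value.lower() in PLACEHOLDERS):
            root.remove(element)
    return root

# Hypothetical widget contents, for illustration only.
document = check_in("PRODUCT", [
    {"tag": "TITLE", "value": "ThinkPad", "required": True},
    {"tag": "CATEGORY", "value": "NONE", "required": False},
    {"tag": "KEYWORD", "value": "", "required": False},
])
```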

[0181] Dispatcher— 704 

[0182] The Web application consists of four servlets and 
three subcomponents. The main servlet is the dispatcher that 
coordinates the activities of all subsystems and interfaces 
with the client application. The source and sink servlets 
allow Trigger Monitor to retrieve fragments from the file 
system and write assembled pages to it. The admin servlet 
provides for administration and monitoring functionality. 
The three subsystems interface with the metastore 712, the 
fragment dependency store 716 and the file system 714 
respectively. 

[0183] The dispatcher 704 is a Web application running
within the Web server 114 that coordinates the activities
of all subsystems and interfaces with the client application.
The source and sink servlets allow the fragment dependency
store 716 to retrieve document fragments from the file
system 714 and write assembled pages to it. The dispatcher
704 consists of a number of servlets and three subcomponents:
(1) metastore manager 710; (2) file system manager
708; and (3) content store manager 706.

[0184] Metastore Manager— 710 

[0185] The MetaStore Manager 710 provides an interface
(e.g. Java DB/2 interface) to a database 712 that stores
the meta-information about the assets stored in the file 
system 714. The metastore 712 maintains information about 
the functional and semantic role of each item of content. The 
metastore 712 also supports fast searches of content and 
maintains state information. The functionality of the metas- 
tore 712 is described in more detail in a later section. 

[0186] File System Manager— 708 

[0187] The file system 714 is where the components or
assets for the documents are stored. The file system manager
708 provides a standard interface (e.g., SCSI, IDE,
FDDI, TCP/IP) with the file system 714, where assets such as
DTDs, XML fragments, images, documents, and HTML are stored.

[0188] Content Store Manager— 706 

[0189] The Content Store Manager 706 is an application, in
this embodiment a Java application, that maintains the
dependency information between assets, i.e., XML servables,
XML fragments, binary assets and XSL style sheets stored
in the file system 714 and the fragment dependency store
716. The fragment dependency store 716 is further described
in a section below. The fragment dependency store 716 is
designed to manage high numbers of rapidly changing
content fragments. By maintaining an Object Dependency 
Graph, and by detecting changes to content, it manages 
pages on a Web server in a timely manner. The fragment 
dependency store 716 allows the loading of specialized 
handlers to perform tasks specific to a particular application. 
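
The idea of an object dependency graph can be illustrated with a small Python sketch. It is not the Trigger Monitor implementation; the class, its method names and the sample asset names are assumptions chosen to show how a change to one fragment can be traced to every page that must be rebuilt.

```python
from collections import defaultdict, deque

class DependencyGraph:
    """Records which assets include which fragments."""

    def __init__(self):
        self._dependents = defaultdict(set)   # fragment -> assets using it

    def add_dependency(self, container, fragment):
        """Record that container includes (depends on) fragment."""
        self._dependents[fragment].add(container)

    def affected_by(self, changed):
        """Return every asset that directly or indirectly includes changed."""
        seen, queue = set(), deque([changed])
        while queue:
            for dependent in self._dependents.get(queue.popleft(), ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen

# Hypothetical assets, for illustration only.
graph = DependencyGraph()
graph.add_dependency("latest_releases_index", "press_release_1")
graph.add_dependency("press_page", "latest_releases_index")
graph.add_dependency("press_page", "logo_image")
stale = graph.affected_by("press_release_1")
```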

[0190] MetaStore— 712 

[0191] The metastore 712 is used to maintain information
about the functional and semantic role of each fragment. The
meta-information stored in the metastore 712 is grouped into
system-generated tags and non-system generated tags. The
values of the system-generated tags are generated by the
dispatcher when a check-in is successful. The values of the
non-system generated tags are specified by the content
creator during the creation of the corresponding document.

[0192] The system-generated tags correspond to the child
elements of the SYSTEM element defined in every
DTD, as described in an earlier section. The non-system 
generated tags correspond to additional elements in the 
DTDs that contain the content or are necessary for main- 
taining the functional and semantic role of the fragments. 
These tags can be further grouped into two parts: 1) the tags 
which are used for describing the XML object, such as 
keywords, categories and publishing information; and 2) the 
tags which hold the content of the XML object, such as 
TITLE and SUMMARY.

[0193] In one embodiment, the metastore 712 is imple- 
mented as a DB2/UDB database. In one embodiment, the 
metastore 712 is based on a fixed set of DB/2 tables for all 
fragment types, but can be extended to include specific 
table(s) for different fragments. 

[0194] IBM DB/2™ is a relational database, and thus 
cannot be used directly to store an XML object, because the
XML object has a hierarchical data model. A mapping from 
XML data model to a set of database tables is needed. In one 
embodiment, DB/2 XML Extender 7.1 is used to map the 
XML document elements that correspond to the metatags 
into a set of pre-defined DB/2 tables. The DB/2 XML 
Extender is an IBM product developed to support the 
XML-based e-business applications using the IBM universal 
database— UDB. 

[0195] The XML Extender provides two access and stor- 
age methods in using DB/2 as an XML repository: XML 
column and XML collection. The XML collection access 
method decomposes XML documents into a collection of 
relational tables or composes XML documents from a col- 
lection of relational tables. These are exactly the operations 
required for the metastore 712, thus the access method used 
is the XML collection method. The XML collection implementation
of the XML Extender requires one DAD for each
DTD that has to be mapped into DB/2. The DAD file is used
to define the relationship between the XML tags and the tables
of the relational database.

[0196] A second embodiment consists of a programmatic 
mapping of the XML elements into the database columns. 

[0197] Search 

[0198] For a content management system that will poten- 
tially have a very large number of interrelated documents 
and fragments, finding and locating a particular fragment or 
servable efficiently becomes one of the major challenges.
Performing such an operation by browsing a directory structure
is both inefficient and unreliable.
The browsing operation is replaced with a search operation 
that leverages the meta-information that is stored in the 
metastore 712. One of the essential functions of the metas- 
tore 712 is to enable this search paradigm. 

[0199] The search feature requires implementation at both
client and server sides. At the client side 102, the GUI 702
provides a search dialog that allows graphical construction
of search queries. The search query consists of the conjunction
of elementary search conditions. The search conditions
are created based on an initial XML specification sent from
the server that specifies the searchable elements, the relational
operators that can be used with each element, and in
some cases the set of values that element can assume. The 
client converts the query into a DASL query. As it receives 
the response from the server, the search dialog parses the 
results and displays them in a tabular format. From the table, 
the editor can select items that can be used in the editor. 

[0200] At the server side 114, when the dispatcher 
receives the search query, it invokes the search module 
within the MetaStore Manager 710. The search module 
converts the DASL query into an SQL query dynamically,
and queries the metadata database 712. It then converts
the search result into DASL format and returns it to the 
client. 
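
The server-side conversion can be sketched as follows. This is an illustration only: the query is represented as a plain list of (field, operator, value) conditions rather than actual DASL XML, and the table name metadata, the searchable-field whitelist, and the function build_sql are hypothetical.

```python
import sqlite3

# Hypothetical whitelist: searchable field -> relational operators allowed for it.
SEARCHABLE = {
    "title": {"=", "LIKE"},
    "author": {"="},
    "revision_date": {"=", "<", ">"},
}

def build_sql(conditions):
    """Build a parameterized SQL query whose WHERE clause is the conjunction
    (AND) of the elementary (field, operator, value) conditions."""
    clauses, params = [], []
    for field, operator, value in conditions:
        if operator not in SEARCHABLE.get(field, set()):
            raise ValueError(f"unsupported condition: {field} {operator}")
        clauses.append(f"{field} {operator} ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT name FROM metadata WHERE {where}", params

conn = sqlite3.connect(":memory:")  # stand-in for the metadata database
conn.execute("CREATE TABLE metadata (name TEXT, title TEXT, author TEXT, revision_date TEXT)")
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", [
    ("a.xml", "Old news", "Dean", "1999-05-01"),
    ("b.xml", "New news", "Dean", "2000-12-22"),
    ("c.xml", "Other", "Weitzman", "2000-12-22"),
])
sql, params = build_sql([("author", "=", "Dean"), ("revision_date", ">", "2000-01-01")])
matches = [row[0] for row in conn.execute(sql, params)]
```

Only field names and operators from the whitelist reach the SQL text, and values are always bound as parameters, which mirrors the idea that the server announces the searchable elements and their permitted operators in advance.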

[0201] In order to ensure the scalability of the application, 
a number of techniques have been used to streamline data- 
base access operations. First, a database connection pool is 
used to maintain a set of active connections, instead of 
creating a new connection for each access. Second, the 
search fields are indexed in the database to speed up search 
operations. Third, the search results are cached to minimize 
repeated access to the database for the same query from the 
same client 102. 

[0202] Fragment Dependency Store — 716 

[0203] The fragment dependency store 716 builds upon 
the Trigger Monitor technology from IBM Watson Research. 
In one embodiment, the fragment dependency store runs in
a Java Virtual Machine 718. The fragment dependency store 
716 is designed to manage high numbers of rapidly changing 
content fragments. By maintaining an Object Dependency 
Graph, and by detecting changes to content, it manages 
pages on a Web server or cached in a network router in a 
timely manner. Trigger Monitor allows the loading of spe- 
cialized handlers to perform tasks specific to a particular 
application. One system for achieving maximum flexibility 
and reuse is disclosed in the patent application entitled
"Method and System for Efficiently Constructing And Consistently
Publishing Web Documents" filed on Apr. 4, 1999
with application Ser. No. 09/283,542 with inventors J. R.
Challenger et al. now [Pending] and commonly assigned
herewith to International Business Machines, which is 
hereby incorporated by reference in its entirety. In addition,
more information on Trigger Monitor is found in the following
publications, which are hereby incorporated by reference
in their entirety: (i) Jim Challenger, Paul Dantzig, and
Arun Iyengar. "A Scalable and Highly Available System for
Serving Dynamic Data at Frequently Accessed Web Sites." In
Proceedings of ACM/IEEE SC98, November 1998; (ii) Jim
Challenger, Arun Iyengar, and Paul Dantzig. "A Scalable
System for Consistently Caching Dynamic Web Data." In
Proceedings of IEEE INFOCOM '99, March 1999; and (iii)
Arun Iyengar and Jim Challenger. "Improving Web Server
Performance by Caching Dynamic Data." In Proceedings of
1997 USENIX Symposium on Internet Technologies and
Systems, December 1997.

[0204] The fragment dependency store 716 uses IBM 
Research's Trigger Monitor system to automatically propagate
fragment changes to all affected fragments and servables,
and to allow for multi-stage publishing to accommodate
quality assurance. The fragment dependency store does
this by creating an Object Dependency Graph (ODG), a 
directed acyclic graph within Trigger Monitor, which rep- 
resents the inclusion relationships of all fragments in the 
system. 
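
Change propagation over such a graph can be sketched as a traversal from the changed fragment to everything that directly or indirectly includes it. The class ObjectDependencyGraph and the file names below are hypothetical; this is not the Trigger Monitor implementation.

```python
from collections import defaultdict, deque

class ObjectDependencyGraph:
    """Directed acyclic graph with edges from a subfragment to each
    fragment or servable that includes it."""
    def __init__(self):
        self._includers = defaultdict(set)

    def add_inclusion(self, subfragment, includer):
        self._includers[subfragment].add(includer)

    def affected_by(self, changed):
        """Return every fragment that directly or transitively includes `changed`."""
        seen, pending = set(), deque([changed])
        while pending:
            node = pending.popleft()
            for parent in self._includers.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    pending.append(parent)
        return seen

odg = ObjectDependencyGraph()
odg.add_inclusion("headline.xml", "story.xml")
odg.add_inclusion("story.xml", "index.xml")
odg.add_inclusion("story.xml", "front.xml")
odg.add_inclusion("sidebar.xml", "front.xml")
affected = odg.affected_by("headline.xml")
```

A change to headline.xml therefore marks story.xml, index.xml and front.xml for republishing, while sidebar.xml is untouched.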

[0205] Several Trigger Monitor stages are chained 
together to allow for multistage publishing. Trigger Monitor 
is written in pure Java running in Java Virtual Machine 718 
and implements handlers as pre-defined actions performed 
on the various configurable resources. Flexibility is achieved 
via Java's dynamic loading abilities, by more sophisticated 
configuration of the resources used by Trigger Monitor, and 
through the use of handler preprocessing of input data. Most 
entities defined in a configuration file implement a public 
Java interface. Users may create their own classes to accom- 
plish localized goals, and specify those classes in the con- 
figuration file. This permits run-time flexibility without 
requiring sophisticated efforts on the part of most users, 
since default classes are supplied to handle the most com- 
mon situations. 
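
Trigger Monitor itself relies on Java's dynamic class loading. The same pattern can be shown in Python as a rough analogue, under the assumption of a hypothetical configuration dictionary that names classes by dotted path and a hypothetical DefaultHandler used when no class is configured. The standard-library class xml.sax.handler.ContentHandler serves only as a stand-in for a user-supplied class.

```python
import importlib

class DefaultHandler:
    """Default class supplied for the common case (hypothetical)."""
    def handle(self, name):
        return f"default:{name}"

def load_handler(config, key, default=DefaultHandler):
    """Instantiate the class named in the configuration, or the default
    class when the configuration does not mention `key`."""
    dotted = config.get(key)
    if dotted is None:
        return default()
    module_name, _, class_name = dotted.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls()

# Hypothetical configuration: one entry overrides the default class.
config = {"dependency_parser": "xml.sax.handler.ContentHandler"}
custom = load_handler(config, "dependency_parser")
fallback = load_handler(config, "page_assembler")
```

Because the class is resolved by name at run time, a new handler can be introduced by editing the configuration rather than the code that uses it.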

[0206] In the present invention, several classes have been 
created for Trigger Monitor to implement three handlers: 

[0207] 1. the Extension Parser;

[0208] 2. the Dependency Parser; and 

[0209] 3. the Page Assembler. 
[0210] Each of these classes is now described.
[0211] Extension Parser 

[0212] Within the present invention, Trigger Monitor
manages different types of files differently based on their 
extensions. Servables, simple, compound, and index frag- 
ments, stylesheets and multimedia assets are all treated 
slightly differently in the publishing flow. 

[0213] The Extension Parser takes in a name of a frag- 
ment, and returns an extension used in the Trigger Monitor 
configuration files to specify actions to take during the 
publish process. The appropriate behavior for each type of 
fragment is defined in the Trigger Monitor configuration 
files. These behaviors include moving assets to different 
stages within the system as well as assembling the servables 
into the expanded mode described in an earlier section and 
invoking the XSL transformation to create viewable pages. 
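
As an illustration, the Extension Parser can be reduced to a function from fragment name to extension, with the per-extension behavior kept in configuration. The extensions and action names below are hypothetical, since the actual configuration entries are not listed here.

```python
import os

# Hypothetical configuration: extension -> ordered actions in the publish flow.
PUBLISH_ACTIONS = {
    "servable": ["assemble", "transform", "move_to_stage"],
    "fragment": ["move_to_stage"],
    "xsl": ["move_to_stage"],
    "gif": ["move_to_stage"],
}

def extension_of(fragment_name):
    """Return the lower-cased extension of a fragment name, without the dot."""
    return os.path.splitext(fragment_name)[1].lstrip(".").lower()

def actions_for(fragment_name):
    """Look up the configured publish actions for a fragment by its extension."""
    return PUBLISH_ACTIONS.get(extension_of(fragment_name), [])

servable_ext = extension_of("news/Front.SERVABLE")
image_actions = actions_for("logo.gif")
unknown_actions = actions_for("notes.txt")
```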

[0214] Dependency Parser 

[0215] The Dependency Parser analyzes an XML object 
and updates the ODG maintained by Trigger Monitor 
accordingly. The ODG maintains the dependencies between 
fragments. Currently defined are two types of dependencies: 
composition and style. The composition dependency maintains
structural information between fragments and between
a complex fragment and its associated asset. The style 
dependency maintains information about the relationship 
between servables and stylesheets. 

[0216] Dependencies are considered to point from the 
subfragments to the fragments that include them. In the case 
of complex fragments, the dependency is from the fragment 
to the associated assets. 
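
The edge directions described above can be made concrete with a short sketch. The element names include, asset and stylesheet and the href attribute are hypothetical markup chosen for illustration, and the direction of the style edge (stylesheet to servable) is an assumption.

```python
import xml.etree.ElementTree as ET

def parse_dependencies(name, xml_text):
    """Return (source, target, kind) edges found in one XML object."""
    root = ET.fromstring(xml_text)
    edges = []
    for ref in root.iter("include"):
        # Composition: from the subfragment to the fragment that includes it.
        edges.append((ref.get("href"), name, "composition"))
    for ref in root.iter("asset"):
        # Composition for complex fragments: from the fragment to its asset.
        edges.append((name, ref.get("href"), "composition"))
    for ref in root.iter("stylesheet"):
        # Style: assumed here to point from the stylesheet to the servable.
        edges.append((ref.get("href"), name, "style"))
    return edges

edges = parse_dependencies(
    "front.xml",
    '<servable><stylesheet href="web.xsl"/><include href="story.xml"/>'
    '<asset href="logo.gif"/></servable>',
)
```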

[0217] Page Assembler

[0218] In the present invention, Trigger Monitor is configured
to invoke the Page Assembler for servables. The Page
Assembler assembles the servable into the expanded mode 
by including the contents of all included subfragments, and 
then invokes the XSL transformation engine to produce 
viewable output pages. As discussed in an earlier section, the 
first step of creating an expanded XML is a method used in 
the absence of a final XLink standard, and the lack of tools 
that handle XLink constructs. 

[0219] The type of the viewable page, as well as its target 
device, is determined from the stylesheet. The assembled 
XML and all the resulting viewable pages are written to one 
file, which is later split up, and these pages are written
to the appropriate directories on the server 114. 
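
The expansion step of the Page Assembler can be sketched as a recursive replacement of inclusion references by the content they point to. The include element, its href attribute, and the in-memory store dictionary are hypothetical. The XSL transformation step is omitted because the Python standard library has no XSLT engine.

```python
import xml.etree.ElementTree as ET

def expand(name, store):
    """Return the element tree for `name` with every <include> replaced,
    recursively, by the root element of the fragment it references."""
    root = ET.fromstring(store[name])
    _expand_children(root, store)
    return root

def _expand_children(parent, store):
    for index, child in enumerate(list(parent)):
        if child.tag == "include":
            replacement = expand(child.get("href"), store)
            replacement.tail = child.tail  # keep text that followed the reference
            parent[index] = replacement    # one-for-one swap keeps indexes aligned
        else:
            _expand_children(child, store)

# Hypothetical content store keyed by fragment name.
store = {
    "front.xml": '<servable><include href="story.xml"/></servable>',
    "story.xml": '<story><include href="headline.xml"/><body>Text</body></story>',
    "headline.xml": "<headline>Hello</headline>",
}
expanded = ET.tostring(expand("front.xml", store), encoding="unicode")
```

The recursion terminates because inclusion relationships form an acyclic graph; a cyclic reference would need an explicit guard.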

[0220] Chaining of Trigger Monitor Stages 

[0221] Currently, two Trigger Monitor stages are used in 
the publish process. They share an ODG, and the sink of the 
first one is the source of the second, creating a publishing 
chain. 

[0222] When a fragment is checked in to the Content store, 
it is added to the shared ODG, and a publish command is 
issued to the first handler. Trigger Monitor reads the fragment
XML from the source servlet, uses the extension parser
to find its extension, and then uses the dependency parser to 
find dependencies to add to the ODG. The page assembler 
then pulls in the contents of the fragment's subfragments, 
and if the fragment is a servable, combines it with its 
stylesheets to produce the output pages (e.g., HTML files). 
The servable XMLs, output pages, binary files, and 
stylesheets — all fragments affected by the check-in — are 
sent to the servlet specified as the sink of the first handler. 
When a servable has been approved, a publish command on 
the servable fragment is issued to the second handler. It is 
reassembled and recombined with its XSLs, and the result- 
ing XML and output pages are published to the production 
Web server through a second sink servlet. Binary files (such 
as images) are also published to the second sink. The Web
server pulls the final HTML and image files from this
sink.
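
The chaining can be modeled with two stages that share a store, so that the sink of the first is the source of the second. The Stage class, the dictionaries standing in for servlets, and the render function standing in for assembly plus XSL transformation are all hypothetical.

```python
def render(xml_text):
    """Stand-in for page assembly followed by XSL transformation."""
    return "<html>" + xml_text + "</html>"

class Stage:
    """One publishing stage that reads from a source and writes to a sink."""
    def __init__(self, source, sink):
        self.source = source
        self.sink = sink

    def publish(self, name):
        xml_text = self.source[name]
        self.sink[name] = xml_text                    # the servable XML itself
        self.sink[name + ".html"] = render(xml_text)  # the output page

content_store = {"front.xml": "<servable/>"}
staging, production = {}, {}
first = Stage(content_store, staging)   # runs on check-in
second = Stage(staging, production)     # runs on approval

first.publish("front.xml")
published_before_approval = "front.xml.html" in production
second.publish("front.xml")
```

Nothing reaches the production store until the second stage is triggered, which is what lets quality assurance sit between check-in and publication.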

[0223] Detailed Process Flow— 1000 

[0224] FIG. 10 is an overall block diagram illustrating the 
process flow 1000 of the end-to-end publishing process
according to the present invention. The following scenario 
describes how the system described here reuses information 
fragments and can easily update the presentation throughout 



a published information space (e.g., WebSite). There are at 
least four inputs that are needed to begin the publishing 
process according to the present invention. The four inputs, 
which in one embodiment are carried out by third party tools 
or in some instances manually prior to the process flow of 
the present invention are as follows: 

[0225] 1. Information Analysis and Modeling 1002. 
This provides information on "what" the published
Web site is about. This may involve building a site 
map, database modeling, and market analysis. 

[0226] 2. Target Audience Analysis 1004 consists of empirical
surveys on "how" the information should be
presented. It includes the choice of languages for the 
GUI to support a multi-lingual editor community, 
and the choice of languages for the final published 
content collection. 

[0227] 3. Target Device Analysis 1006 consists of empirical
surveys on "where" or on "what device" information
is presented, e.g., a type of computer, a PDA, a cell
phone, or other information processing device 
capable of presenting information to a user. 

[0228] 4. Workflow and Role Analysis 1008.

[0229] The four inputs above assist in defining how the 
information on the site should be organized and decomposed 
into reusable fragments of information. The analysis will 
directly impact the document templates, stylesheets, and 
auxiliary lookup tables that get constructed. In addition, this 
analysis will inform the process of defining the meta data 
that will be stored in the metadata database 712. 

[0230] The end result of process inputs 1002-1008
is an understanding of the set of document templates (e.g. 
DTDs) for all information fragments, a set of corresponding 
stylesheets (e.g. XSL), a set of lookup tables that store 
additional information on DTD elements including transla- 
tions, and a set of workflow roles that allow editors to access 
particular document types. 

[0231] Identify Meta Information, Servables and Frag- 
ments — 1010 

[0232] Next, in process step 1010, all the meta information
needed to describe the content, that is, any information helpful for
indexing the content in metastore database 712, needs to be
defined. Some meta information such as title, author,
contents, revision date, and document type are indexed by 
default. This metadata is not only used for finding content 
during authoring on content editor 702 but is also used for 
personalization of the content during presentation in step 
1024. 



Function - Information architects and system designers identify the meta tags
and document types that will be used throughout an implementation
of this process. They determine the fragmentation granularity and the
composition of each servable and fragment from subfragments.

Input - The input is the results of the modeling and analysis from the external
modules for information analysis, target audience analysis, target
device analysis and workflow and role analysis.

Output - The output from this step is information to guide the construction of
the metastore 712, the document templates and the stylesheets
constructed in steps 1012, 1014 and 1016.

Initialize MetaStore - 1012



Function - A database administrator creates the metadata database(s) 712 and 
database tables. 

Input - Input is a database management tool and the results of step 1010. 

This includes the type of meta tags to be included in the tables within 

the metadata database 712. 
Output - The metadata database 712 is initialized and made operational. The 

tables and columns are set up in the database 712 that will allow for

the storing and searching of documents within the system. 
Create Document Templates - 1014 

Function - A domain expert creates document templates that define the structure 
of the servables and fragments identified in step 1010. In addition,
the domain expert creates auxiliary lookup tables for DTDs as well as
the DTD-to-database mapping files.

Input - The input is the results of the information modeling and analysis 
modules (1002-1008) from step 1010. 

Output - Multiple document templates (e.g., DTDs or schemas) that define the 
structure of each document type. These templates describe the 
structure of each document fragment and servable and how the 
elements in the document are related, including how many times (1 
required, optional, 0 or more, or 1 or more, etc.) the element will
appear in the final document. The lookup tables contain more
information on each DTD element, such as the type information for
each element, help strings, and any translations to more user friendly
names or other languages. The lookup table allows for the GUI to be
automatically generated from the DTD. Further files specify the
mapping of DTD elements to database tables. 
Create Stylesheets - 1016 

Function - A designer creates the stylesheets that determine the presentation 
and layout of the information in each servable for each target 
audience and target device. 

Input - Results of the analysis modules, and results of step 1014. 

Output - The output is multiple stylesheets for each servable document for 
each specified device. 
Create/Edit and Compose Content - 1018 



Function - Authors and editors create content for the Web site. 

A more detailed description of this step with sub-steps is given in FIG. 
11. 

Input - Content creation interface 702, document templates, knowledge 

about the requirement for new content or about the necessity to edit 
existing content 

Output - Content files in file system 714, meta information in metastore 712, 
information about the content dependencies in the object dependency 
graph. 

Preview and Approve Content - 1020 

Function - Authors, editors and approvers view the output produced from the 

content using the selected stylesheets. 
Input - XML content and stylesheets along with the viewing interface on client 

editor 702. 

Output - The output is the fully rendered pages on the Web or simulated on 
various devices (e.g., PalmPilot™) to be reviewed by the appropriate
person in the workflow.

Publish - 1022 



Function - Approvers and publishers publish the content to the presentation 
system. 

Input - Input consists of the content created in step 1018, stylesheets created 
in step 1016, and the knowledge that the servables are ready for 
publishing from step 1020. 

Output - Approved output pages are sent to the presentation engine.



[0233] Presentation Engine— 1024 

[0234] A presentation engine such as IBM's WebSphere™
platform is used to present the resulting Web page.



[0235] Details of Create/Edit Process Detail Flow— 1100 

[0236] The following is a further detail of the process flow 
1000 of FIG. 10 for the Create/Edit Process 1018, according
to the present invention. 

Editor Selects Type of New Document - 1102

Function - The editor selects the type of document to be created from a menu
of possible types available for this person in the roles that they are
associated with.

Input - A list of the document types that the particular editor can create.
Output - The output is the selection of a particular document type to edit. This 
may be a fragment or servable document type. 
System Dynamically Creates a Blank Form - 1104 

Function - The system creates a blank form based on the document template for 

the particular document type chosen. 
Input - The user selection from 1102 and the document type definitions from
step 1118.

Output - A form displayed in the client GUI 702 that allows the user to 

interactively add the content to the form. The form is based on the 
document template and only allows valid documents to be 
constructed based on the specification in the document type 
definition. 

Editor Searches and Selects a Document - 1106 

Function - The editor searches and selects an existing document using the 
metastore 712. 

Input - The search interface allows the user to specify the constraints of the 

specific documents they want to retrieve. 
Output - The output is the selection of a particular document to retrieve from 

the file system 714. 
System Retrieves the Document - 1108 

Function - The system retrieves the document.

Input - The input is the user's selection from step 1106 and the documents

already created in the system. 
Output - The output is the XML document and its attachments (if any). 
System Dynamically Creates a Form and Fills it in - 1110 

Function - The system dynamically creates a form similar to the form created in
step 1104. But in this case, the system automatically fills it in with the
values of the elements from the selected document.

Input - Input is the retrieved document from 1108 and the document
definition from 1118.

Output - A form displayed in the client GUI 702, with the fields of the form
initialized to the values of the elements of the retrieved document.
Editor Fills in the Form - 1112 

Function - The editor fills the form with content for the newly created document. 
Input - Input to this step is the form created in step 1104. 
Output - The output is the form with all required fields filled in.
Search/Select Sub-Fragments - 1114 

Function - The editor searches for subfragments and, if necessary, references
them in the document being created/edited.
Input - The search interface is used to find relevant subfragments to be inserted
into the document being created/edited.
Output - The output is a reference to a subfragment placed into the form of the
current document.
Editor Modifies the Form - 1116 

Function - The editor modifies the form of an existing document. 
Input - Input to this step is the content and form created in step 1110. 
Output - The output is the form with all required fields filled in.
Editor Checks in the Document - 1118 

Further details are given in the functional block diagram of FIG. 12.
Function - The editor checks in the created document.

Input - Input is the filled in document in the editor window from either creating 

a new document 1112 or editing an existing one 1116. 
Output - Output is the acknowledgement of the checkin process 1200. 
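
Steps 1104 and 1110 derive a form from the document template and an auxiliary lookup table. A minimal sketch of that derivation follows, assuming a deliberately restricted DTD subset (a flat, comma-separated content model with ?, * and + occurrence indicators) parsed with a regular expression. The DTD text, the LOOKUP table, the control names and the function form_fields are all hypothetical.

```python
import re

DTD = """
<!ELEMENT story (headline, byline?, paragraph+, keyword*)>
<!ELEMENT headline (#PCDATA)>
<!ELEMENT byline (#PCDATA)>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT keyword (#PCDATA)>
"""

# Hypothetical lookup table: element -> (user-friendly label, control type).
LOOKUP = {
    "headline": ("Headline", "single_line"),
    "byline": ("Author", "single_line"),
    "paragraph": ("Body text", "multi_line"),
}

CHILD = re.compile(r"(\w+)([?*+]?)")

def form_fields(dtd, root):
    """Describe one form field per child element in the root's content model."""
    pattern = r"<!ELEMENT\s+" + re.escape(root) + r"\s+\(([^)]*)\)\s*>"
    match = re.search(pattern, dtd)
    if match is None:
        raise ValueError(f"no declaration found for element {root!r}")
    fields = []
    for name, occurrence in CHILD.findall(match.group(1)):
        label, control = LOOKUP.get(name, (name, "single_line"))
        fields.append({
            "element": name,
            "label": label,
            "control": control,
            "required": occurrence in ("", "+"),    # exactly one, or one or more
            "repeatable": occurrence in ("*", "+"),  # zero or more, or one or more
        })
    return fields

fields = form_fields(DTD, "story")
```

The editor sees only labels and controls; the DTD syntax stays hidden, and the required and repeatable flags are what let the form allow only valid documents to be constructed.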



[0237] FIG. 12 is a functional block diagram 1200 of the 
check-in document process of FIG. 10, according to the 
present invention. 



Details of Editor Checks in Document - 1202

Function - The editor checks in the document to save it in the system. 
Input - The form input from either a newly created document 1112 or a

modified existing document 1116. 
Output - The output is an XML document that conforms to the document
template for the specified document type.
Save Document as XML File - 1204

Function - The document is saved in the file system 714.
Input - The XML document from step 1202 is provided as input.
Output - The output is the XML file in the file system 714.
Save Attachments - 1206 

Function - Any uploaded attachments (e.g., stylesheets, images, etc.) to the XML
document are saved in the file system 714.
Input - The input is the content transferred to the server along with the XML
document from 1204.
Output - The output is the attachments saved in the file system 714.
Save Meta Information in Metastore - 1208 

Function - Meta information from the XML is saved to the metastore database 
712. This includes automatically constructed data, such as user and 
modified time, as well as application specific meta tags such as
category definitions.

Input - The XML file being saved is the input to this step.

Output - The output is the meta data in the appropriate tables within the 
metastore database 712. 
Update ODG - 1210 

Function - The function of this step is to update the object dependency graph 
(ODG) with the various links between fragments. These links are
inclusion links (e.g., subfragments included within another fragment)
and other links such as stylesheet links (e.g., links between
stylesheets and their servables).

Input - Input is the XML file from step 1208 with references to other
fragments (e.g., subfragments or stylesheets).

Output - The output is an updated ODG with proper interdependencies
between fragments in the fragment dependency store.
Generate Preview Pages - 1212 

Function - The purpose of this step is to cache the preview pages so they are 
immediately available when editors/approvers want to preview the 
servable pages.

Input - The update to the ODG 1210 triggers a publish of the servable pages 

from the XML file. 
Output - The output is the temporary preview files in the file. 



[0238] While the invention has been illustrated and 
described in the preferred embodiments, many modifications 
and changes therein may be effected by those skilled in the
art. It is to be understood that the invention is not limited to
the precise construction herein disclosed. Accordingly, the
right is reserved to all changes and modifications coming
within the true spirit and scope of the invention. 

What we claim is: 

1. A method on an information processing unit performing 
steps for creating a user interface (UI) to assemble a docu- 
ment that conforms to a particular document type definition, 
the method comprising: 

receiving a user selection for a document type; 

selecting one of a plurality of document type definition 
types based upon the document type received; 

parsing one or more of a plurality of elements in the 
document type definition types selected; 

mapping to one or more interface controls each of the 
plurality of elements; 



presenting a UI editor by assembling the one or more 
interface controls without presenting specific document 
type definition syntax to a user; 

receiving a user input for zero or more content objects that 
are associated with the interface controls; and 

aggregating the content objects associated with the inter- 
face controls. 

2. The method according to claim 1, wherein the step of 
selecting a plurality of document type definition types
includes document type definition types selected from the 
group of document type definition types consisting of DTDs 
and XML Schemas. 

3. The method according to claim 1, wherein the step of 
presenting a UI includes presenting a UI selected from the 
group of UIs consisting of a graphical user interface (GUI) 
and an interactive voice response (IVR) system. 

4. The method according to claim 3, wherein the step of 
presenting a UI includes presenting a UI which is a what- 
you-see-is-what-you-get (WYSIWYG) interface. 

5. The method according to claim 3, wherein the step of 
presenting a UI includes presenting a UI which is a wizard. 

6. The method according to claim 1, wherein the step of
mapping includes interface controls selected from a group of 
interface controls consisting of an icon, a pull-down menu, 
a button, a selection box, a progress indicator, an on-off
checkmark, a scroll bar, a window, a window edge for 
resizing the window, a toggle button, a form, and a UI 
widget. 

7. The method according to claim 1, wherein the step of
parsing includes parsing one or more of a plurality of 
elements to determine a type and a hierarchical context and 
wherein the step of mapping to one or more interface 
controls includes mapping the type and context to one or 
more interface controls. 

8. The method according to claim 7, wherein the step of 
mapping further includes the sub-step of retrieving a user's
profile to determine which of the one or more interface 
controls are mapped to each of the plurality of elements. 

9. The method according to claim 8, wherein the sub-step 
of retrieving a user's profile includes retrieving a user's 
profile from a group of user's profile information consisting 
of a national language, a user preference, an authorization 
and a preferred output device type. 

10. The method according to claim 7, wherein the step of 
parsing includes parsing one or more of a plurality of 
elements to determine a hierarchical context based on an 
Xpath. 

11. The method according to claim 8, wherein the step of
parsing includes parsing one or more of a plurality of
elements to determine a type selected from a group of types
consisting of a single line input, a multiple line input, a
choice element, a pull-down menu, a button, a selection box, 
an on-off checkmark, a toggle button, and a UI widget. 

12. The method according to claim 11, wherein the step of 
parsing includes parsing at least one composite element 
comprising two or more types. 

13. The method according to claim 1, wherein the step of
presenting a UI editor includes assembling the one or more 
interface controls recursively, maintaining relational links 
between the one or more interface controls and each of the 
plurality of elements. 

14. The method according to claim 1, wherein the step of 
aggregating further includes the sub-step of: 

removing empty optional elements. 

15. The method according to claim 1, wherein the step of 
aggregating further includes the sub-step of: 

removing empty category elements.

16. The method according to claim 1, wherein the step of 
aggregating further includes the sub-step of: 

submitting the assembled content object to be checked-in 
for subsequent processing. 

17. The method according to claim 16, wherein the 
sub-step of submitting the assembled content object to be
checked-in for subsequent processing includes being
checked-in as XML. 

18. A method comprising steps on an information pro- 
cessing system to build a UI interface for creating a docu- 
ment based on a document type definition without present- 
ing the specific syntax of the document type definition to a 
user, the method comprising: 

receiving a user selection for an existing document;

determining the document type definition of the existing 
document; 



retrieving a document type definition wherein the docu- 
ment type definition comprises a plurality of elements; 

determining the type and context information based on the 
document type definition selection received; 

mapping for each element in the document type definition 
the type and the context; 

assembling the document type definition elements and 
any content from any preexisting document into a UI; 
and 

displaying the assembled document type definition ele- 
ments and any content in a UI. 

19. The method according to claim 18, further comprising 
the steps of: 

receiving user input to modify any content displayed; and 

modifying the content based on the user input. 

20. The method according to claim 18, wherein the step 
of retrieving a document type definition includes a document 
type definition type selected from the group of document
type definition types consisting of a DTD and a schema. 

21. The method according to claim 18, wherein the step 
of displaying includes displaying a UI selected from the 
group of UIs consisting of a graphical user interface (GUI) 
and an interactive voice response (IVR) system. 

22. The method according to claim 18, wherein the 
interface controls are selected from a group of interface 
controls consisting of an icon, a pull-down menu, a button, 
a selection box, a progress indicator, an on-off checkmark, 
a scroll bar, a window, a window edge for resizing the 
window, a toggle button, a form, and a UI widget. 

23. A computer readable medium containing program- 
ming instructions for creating a user interface (UI) to 
assemble a document that conforms to a particular document 
type definition, the programming instructions comprising:

receiving a user selection for a document type; 

selecting one of a plurality of document type definition 
types based upon the document type received; 

parsing one or more of a plurality of elements in the 
document type definition types selected; 

mapping to one or more interface controls each of the 
plurality of elements; 

presenting a UI editor by assembling the one or more 
interface controls without presenting specific document 
type definition syntax to a user; 

receiving a user input for zero or more content objects that 
are associated with the interface controls; and 

aggregating the content objects associated with the interface
controls.

24. The computer readable medium according to claim
23, wherein the programming instruction of selecting a
plurality of document type definition types includes docu- 
ment type definition types selected from the group of 
document type definition types consisting of DTDs and 
XML Schemas. 

25. The computer readable medium according to claim 
23, wherein the programming instruction of presenting a UI
includes presenting a UI selected from the group of UIs 
consisting of a graphical user interface (GUI) and an inter- 
active voice response (IVR) system. 

26. The computer readable medium according to claim
25, wherein the programming instruction of presenting a UI 
includes presenting a UI which is a what-you-see-is-what- 
you-get (WYSIWYG) interface. 

27. The computer readable medium according to claim 
25, wherein the programming instruction of presenting a UI 
includes presenting a UI which is a wizard. 

28. The computer readable medium according to claim 
23, wherein the programming instruction of mapping
includes interface controls selected from a group of interface 
controls consisting of an icon, a pull-down menu, a button, 
a selection box, a progress indicator, an on-off checkmark, 
a scroll bar, a window, a window edge for resizing the 
window, a toggle button, a form, and a UI widget. 

29. The computer readable medium according to claim 
23, wherein the programming instruction of parsing includes 
parsing one or more of a plurality of elements to determine 
a type and a hierarchical context and wherein the step of 
mapping to one or more interface controls includes mapping 
the type and context to one or more interface controls. 

30. The computer readable medium according to claim
29, wherein the programming instruction of mapping further
includes the programming instruction of retrieving a user's 
profile to determine which of the one or more interface 
controls are mapped to each of the plurality of elements. 

31. The computer readable medium according to claim
30, wherein the programming instruction of retrieving a
user's profile includes retrieving a user's profile from a
group of user's profile information consisting of a national
language, a user preference, an authorization and a preferred
output device type. 

32. The computer readable medium according to claim
29, wherein the programming instruction of parsing includes
parsing one or more of a plurality of elements to determine 
a hierarchical context based on an Xpath. 

33. The computer readable medium according to claim
30, wherein the programming instruction of parsing includes
parsing one or more of a plurality of elements to determine
a type selected from a group of types consisting of a single
line input, a multiple line input, a choice element, a pull-down
menu, a button, a selection box, an on-off checkmark,
a toggle button, and a UI widget. 

34. The computer readable medium according to claim 
33, wherein the programming instruction of parsing includes 
parsing at least one composite element comprising two or 
more types. 



35. The computer readable medium according to claim 
23, wherein the programming instruction of presenting a UI 
editor includes assembling the one or more interface controls
recursively, maintaining relational links between the
one or more interface controls and each of the plurality of 
elements. 

36. The computer readable medium according to claim
23, wherein the programming instruction of aggregating 
further includes the sub-step of: 

removing empty optional elements. 

37. The computer readable medium according to claim 
29, wherein the programming instruction of aggregating 
further includes the sub-step of:

removing empty category elements.

38. The computer readable medium according to claim 
29, wherein the programming instruction of aggregating 
further includes the sub-step of:

submitting the assembled content object to be checked-in 
for subsequent processing. 

39. A system for creating a user interface (UI) to assemble 
a document that conforms to a particular document type 
definition, the system comprising: 

an input device for receiving a user selection for a 
document type; 

a file system for selecting one of a plurality of document 
type definition types based upon the document type 
received; 

a parser for parsing one or more of a plurality of elements 
in the document type definition types selected; 

a map for mapping to one or more interface controls each 
of the plurality of elements; 

a UI editor presented on an output device by assembling 
the one or more interface controls without presenting 
specific document type definition syntax to a user; 

means for receiving user input for zero or more content 
objects that are associated with the interface controls; 
and 

an assembler for aggregating the content objects associ- 
ated with the interface controls. 

* * * * * 